Exploring Assignments, Student Work, and 
Teacher Feedback in Reforming High Schools: 
2002-03 Data from Washington State 






A Report from the Evaluation of the Bill & Melinda Gates 
Foundation's National School District and Network Grants Program 



January 2004




American Institutes for Research



SRI International 




Contents 



Chapter 1: Rationale for Examining Assignments, Student Work, and Teacher Feedback

Relationship to Prior Research on Authentic Intellectual Achievement

Beginning our Work in Washington State

Purpose of This Report

Chapter 2: Our Measures of Rigor, Authenticity, and Quality

Sample Assignments and Work in English/Language Arts

Sample Assignments and Work in Mathematics

Examining the Assignments, Student Work, and Feedback

English/Language Arts Assignments

English/Language Arts Student Work

Mathematics Assignments

Mathematics Student Work

Teacher Feedback

Conducting the Scoring

Chapter 3: The Quality of Our Measures

Scoring Assignments and Student Work

Reliability of Scoring

Scoring Data in English/Language Arts

Examples of Low- and High-Scoring Assignments and Student Work in English/Language Arts

Score Distributions for Assignments and Student Work in English/Language Arts

Scores for Typical and Challenging Assignments and the Resulting Student Work in English/Language Arts

Scoring Data in Mathematics

Examples of Low- and High-Scoring Assignments and Student Work in Mathematics

Score Distributions for Assignments and Student Work in Mathematics

Scores for Typical and Challenging Assignments and the Resulting Student Work in Mathematics

Relating these Data to Other Information on Teaching and Learning

Chapter 4: Summary and Conclusions

Chapter 5: Next Steps

Creating Meaningful Reporting Scales

Examining the Relationships among Assignments, Student Work, and Achievement Test Results

Collecting Assignments and Work Nationwide

Refining Scoring Rubrics and Procedures

Exploring Options for Studying Non-Written Work

References

Technical Appendix







Figures: 

Figure 2.1: Typical Assignment in 10th-Grade English/Language Arts

Figure 2.2: Student Work for a Typical Assignment in 10th-Grade English/Language Arts

Figure 2.3: Challenging Assignment in 10th-Grade English/Language Arts

Figure 2.4: Student Work for a Challenging Assignment in 10th-Grade English/Language Arts

Figure 2.5: Typical Assignment with Student Work in 10th-Grade Mathematics

Figure 2.6: Challenging Assignment in 10th-Grade Mathematics

Figure 2.7: Student Work for a Challenging Assignment in 10th-Grade Mathematics

Figure 3.1: Low-Scoring Assignment and Low-Scoring Student Work in English/Language Arts

Figure 3.2: High-Scoring Assignment in English/Language Arts

Figure 3.3: High-Scoring Student Work in English/Language Arts

Figure 3.4: Teacher Feedback in English/Language Arts

Figure 3.5: Score Distribution for Assignments in English/Language Arts

Figure 3.6: Score Distribution for Student Work in English/Language Arts

Figure 3.7: Score Distribution for Teacher Feedback in English/Language Arts

Figure 3.8: Score Distribution for Typical and Challenging Assignments in English/Language Arts

Figure 3.9: Score Distribution for Student Work on Typical and Challenging Assignments in English/Language Arts

Figure 3.10: Low-Scoring Assignment and Low-Scoring Student Work in Mathematics

Figure 3.11: High-Scoring Assignment and High-Scoring Student Work in Mathematics

Figure 3.12: Teacher Feedback in Mathematics

Figure 3.13: Score Distribution for Mathematics Assignments

Figure 3.14: Score Distribution for Student Work in Mathematics

Figure 3.15: Score Distribution for Teacher Feedback in Mathematics

Figure 3.16: Score Distribution for Typical and Challenging Mathematics Assignments

Figure 3.17: Score Distribution for Student Work on Typical and Challenging Assignments in Mathematics







Chapter 1: Rationale for Examining Assignments,
Student Work, and Teacher Feedback



Leaders at the Bill and Melinda Gates Foundation have dedicated 
themselves and a substantial portion of their education portfolio to 
improving American high schools. In particular, they seek to reduce 
inequities in the educational experiences of historically underserved teens. 
Foundation officials want to help convert large, troubled high schools into 
small learning communities where all students excel. Additionally, they 
want to help create new small schools that replicate promising high school 
models. Reformers in foundation-supported schools are working to create
learning environments that are personalized, authentic, and rigorous; that
prompt students to take responsibility for learning, make choices, and do
high-quality work; and that are linked to the broader community and
real-world concerns (http://www.gatesfoundation.org/Gates/Grants).

Researchers at the American Institutes for Research (AIR) and SRI 
International are studying these efforts. We are working with foundation 
officials and reformers across the country to study high school change and 
what it takes to improve teaching and learning. We are examining the 
extent to which foundation-supported schools adopt elements of effective 
schooling and show better, more equitable outcomes for students. We also 
are investigating the factors that promote or impede school change and its 
sustained success. 

We are collecting a wide range of quantitative and qualitative data in 
foundation-supported schools. We are observing classrooms, interviewing 
teachers and other school leaders, and talking to students about what they 
do. We also are interviewing school district leaders and staff in the 
organizations charged with helping schools reform. We are collecting 
quantitative data through surveys administered to principals, teachers, and 
students and we are collecting achievement test data. We are following 
foundation-supported schools over time and comparing their activities and 
outcomes to those of conventional high schools nearby. 

In addition to this work, we are taking a careful look at teaching and 
learning. For a subset of schools in the larger evaluation, we are working 
with teachers to paint a detailed picture of instruction and of students’ 
academic work. We are trying to determine whether students in 
foundation-supported schools are exposed to challenging learning 
opportunities and whether challenging learning opportunities open the 
door to intellectually complex student work. We are studying these 
questions by collecting samples of the assignments students tackle and the
work that they produce. We are examining the rigor and authenticity of
assignments and the quality of the resulting student work. We are coupling 
the data with information from jurisdiction-sponsored standardized tests 
and data from the broader evaluation. 

We are collecting the assignments and student work from 
English/language arts and mathematics teachers in 24 foundation-supported
schools over time.¹ We will use the data to show how teaching
and learning evolve as reformers continue their work. We also will 
contrast these data to assignments and student work for conventional high 
schools in the same districts. 

Relationship to Prior Research on Authentic Intellectual 
Achievement 

This study builds on research conducted in the Chicago Public Schools by 
Fred Newmann, Tony Bryk, and their colleagues (Newmann, Lopez, & 
Bryk, 1998; Bryk, Nagaoka, & Newmann, 2000; Newmann, Bryk, & 
Nagaoka, 2001). Their research in Chicago elementary and middle schools 
examined students’ opportunities to construct knowledge, communicate 
clearly and well, do work with authentic purposes, and use language and 
mathematics conventions accurately and effectively. Their work suggests 
that assignments that demand higher-order thinking skills, deep 
understanding of content, elaborated communication, and activities that 
are similar to real-world tasks elicit work that is intellectually more 
complex from students. 

Our research follows their methods and builds on their measurement
model by extending their scoring criteria to high school assignments and
student work. In addition, we are studying two aspects of teaching and
learning not examined in the Chicago research:

• The choices that students make about what they will study and 
how they will learn. 

• The quality of teacher feedback on student work. 



1 See the Technical Appendix of this report for details on our school sampling plan. 

2 We will collect assignments and work from eight conventional high schools that offer 
useful comparisons for the foundation-supported schools. 







Beginning our Work in Washington State 

We began our nationwide study of teaching and learning in foundation- 
supported schools with a pilot study in Washington State. We will use this 
experience to test our methods and measures in high school settings and to 
strengthen the work before expanding our data collection. 

This report describes our pilot work in 2002-03 with eight large 
Washington high schools planning to begin conversion to small learning 
communities in 2003-04. We will return to these schools in 2004-05 to 
see whether and how teaching and learning have changed. 

In 2003-04, our work will expand beyond Washington State to other large
high schools across the country that are planning to convert to small
schools, and to several new small high schools. As with the schools in Washington,
we will follow these schools over time to see how teaching and learning 
evolve. We also will contrast their work with the efforts of teachers and 
students in conventional high schools. 

Purpose of This Report 

This report describes our measures of the rigor and authenticity of 
assignments, the quality of student work, and the utility of teacher 
feedback. It details our data collection methods, gives examples of the 
assignments and work we gathered, describes their scoring, and discusses 
our results. It examines the quality of the measures and makes suggestions 
for improving them and our methods moving forward. 







Chapter 2: Our Measures of Rigor, Authenticity,
and Quality



We began our work in Washington State by enlisting the participation of 
English/language arts and mathematics teachers in eight large high schools 
scheduled to undergo conversion in 2003-04. In 2002-03, we asked 24
10th-grade English/language arts and 24 10th-grade mathematics teachers
in the eight schools for copies of their assignments and student work. We 
asked teachers to provide us with eight of their assignments over the 
course of the school year and for the work of 12 randomly selected 
students in response to three of the eight assignments. We asked teachers 
for four assignments that were typical of their students’ day-to-day 
activities and for four assignments that challenged students to show what 
they knew and could do at high levels. For each assignment, we also asked 
for descriptions of teaching objectives, teaching resources, and assessment 
goals.³
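The collection design described above can be sketched in a few lines of Python. This is an illustration only, not the study's procedure: the roster, the seed, and the function names are hypothetical, and the sketch assumes a simple random draw of 12 students from one class. The report's Technical Appendix, not this sketch, documents the actual sampling.

```python
import random

# Each teacher was asked for eight assignments: four typical, four challenging.
ASSIGNMENT_TYPES = ["typical"] * 4 + ["challenging"] * 4


def select_students(roster, n_students=12, seed=None):
    """Draw a simple random sample of students from one class roster."""
    if len(roster) < n_students:
        raise ValueError("roster is smaller than the requested sample")
    rng = random.Random(seed)  # seeded so the draw can be reproduced
    return sorted(rng.sample(roster, n_students))


def max_work_samples(n_teachers=24, n_with_work=3, n_students=12):
    """Upper bound on work samples per subject if every request is met."""
    return n_teachers * n_with_work * n_students


# Hypothetical class of 30 students.
roster = [f"student_{i:02d}" for i in range(1, 31)]
sample = select_students(roster, seed=2003)
print(len(sample), len(ASSIGNMENT_TYPES) * 24, max_work_samples())
# prints: 12 192 864
```

Under these assumptions, full participation would yield at most 192 assignments and 864 pieces of student work per subject.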

Sample Assignments and Work in English/Language Arts 

Examples of the typical and challenging assignments we gathered and the 
work that resulted from them in English/language arts are shown in 
Figures 2.1 through 2.4. Figure 2.1 provides an example of an assignment 
described as typical of students’ day-to-day activities by one of the 
participating teachers. 

Figure 2.1: Typical Assignment in 10th-Grade English/Language Arts

Dandelion Wine 

Using pages 1 through 32 in the novel Dandelion Wine, answer the 
following questions in complete sentences. Use a separate sheet of lined 
paper so you have plenty of room to write. 

1. How does Doug use his imagination to turn an ordinary experience 
into a magical one? 

2. Why does their father take Doug and Tom to the forest? 

3. Does Doug seem to share his father’s respect and love for nature? 

Give an example from the book to support what you decide. 

4. What do you think the “thing” that Doug feels in the woods turns out 
to be? 



3 See the Technical Appendix of this report for details on the sampling and data 
collection procedures for Washington State. 






5. What does Doug convince Mr. Sanderson to let him have? Why does 
he want it so badly? 

6. What does Doug decide to keep track of? What good will it be to him 
later as an author? 

7. Describe Tom, Doug’s younger brother. 

8. How do the porches of summer cause Green Town lifestyles to 
change? 

9. Do you agree that the natural world will win in the end — will human 
beings’ gradual takeover of the wilderness eventually lead to human 
beings’ own extinction? 



A review of this assignment reveals that students can successfully
complete most of its requirements by summarizing or paraphrasing
information from the novel Dandelion Wine. Little generation or
exploration of new ideas is required to answer most of the questions. The 
assignment specifies the content of student work and the way that mastery 
should be demonstrated. Students do not need to write extensively in 
response, and the assignment has little application beyond the classroom. 

Figure 2.2, shown next, provides an example of student work that 
responds to the Dandelion Wine assignment. 






Figure 2.2: Student Work for a Typical Assignment in 10th-Grade
English/Language Arts




The student whose work is shown in Figure 2.2 responded fairly 
successfully to the Dandelion Wine assignment. As just mentioned, 
however, the assignment called for very little original or elaborate 
communication. The student responded briefly, primarily by recounting
information from the novel. 



4 The teacher feedback on this and other student work samples is discussed later in the 
chapter. 






Figure 2.3 shows an example of an assignment described by the teacher who
provided it as intellectually challenging for students.

Figure 2.3: Challenging Assignment in 10th-Grade English/Language
Arts

Psychiatrist Writing Assignment 

You are to write a two- to three-page paper about Holden Caulfield from 
Catcher in the Rye. Write the paper as if you are Holden’s psychiatrist. 
You are to choose three things about Holden’s personality to discuss, then 
analyze those three things, and write what you think about them. Try and 
relate to Holden — identify what is wrong with his thinking. 



This assignment calls on students to identify the character traits of interest 
to them; move beyond the literal meaning of the text to analysis and 
evaluation of Holden’s personality; and support their arguments with 
detail, illustrations, or reasons. By specifying that three traits be examined, 
the assignment prompts extended writing. 

Figure 2.4 provides an example of student work responding to the 
Psychiatrist Writing assignment. 






Figure 2.4: Student Work for a Challenging Assignment in 10th-Grade
English/Language Arts

The Catcher in the Rye

"Good evening Holden! Im very glad that you could make this last meeting of ours,
you do know Im leaving for retirement. I have been your psychiatrist for
some time now Holden and I think its time to discuss some important matters. I think I
should start off by saying Holden that you are a very courageous boy for saying and
doing things that you feel are right for you and not for anyone else. While others never
say what they feel or never do anything for themselves. That takes guts. But I would
also like to discuss some other issues I've noticed about you. Just a few suggestions I
have to help you in your situation and to help you understand where Im coming from.
To make this easy for you and me I placed three words that I think best describe you,
and with these three words Im going to define each one to help you better understand
what exactly they mean. Okay? Lets give it a whirl.

First word...

In isolation: This is the very first word I picked because its the first vibe that I
get from you. Holden I've noticed you always say a word called phony, and I see that
you apply this to many people. You always state how someone is phony, or their doing
something phony, and their looking phony, and well basically just phony. But Holden,
did it ever occur to you that maybe their trying to be nice? That just maybe their trying
to get to know you more, maybe understand you a little better? It seems that your so
bitter to allot of people. Maybe if you just tried to understand them you would
understand where their coming from, and you wouldnt think their so what you call
phony. But dont get me wrong. Im not saying that you are a very [illegible] person
cause also from what I see there are certain people that you do care about. For
example your brother Allie. It seems you cared a whole lot for him, and when he
passed away you kind of died with him. It hurt you. You stated how intelligent he was
and how he never got mad at anyone. You also care a whole lot about your sister
Phoebe. You say how she brings in straight A's and is just great to talk to. But Holden
these people are young. Could it be that maybe you still have a mind of a child? Maybe
your not ready to be on your own and make decisions that regard you life. You might
think you are ready to face the real world but if you keep isolating people its going to
be very hard out there. Its going to seem that the only sane person in the world is you.
So maybe after this meeting you could try not being so harsh and maybe not having a
judgment right away.

My second word...

Intimidated: Now I don't think your intimidated by allot of people but certain
ones. For instance you told me that on your date with Sally you guys went to go see
The Lunts, and on this date Sally saw someone she knew, and he came and talked to
her. You stated that I should have seen the way he hugged her and the way they said
hello to each other. Now Holden, was it really as bad as it seemed or perhaps someone
was a little jealous? I think you really liked Sally considering the fact of how good you
told me she looked. But this isn't the only time I recall you getting jealous. I happen to
remember the time you told me when your roommate at Pencey Prep named Stradlater
had a date with a girl named Jane. It also seemed you really like the girl too. While he
was gone you kept thinking about what they were doing and what was happening, and
when Stradlater came back you were really eager to know what had happen. So eager
that you and Stradlater got in a fight. I think this shows that this girl was very
important to you. See Holden by being in isolation people will never know who you like.
See if Stradlater would have known you liked her maybe the whole date wouldnt have
never happened. But also maybe if you would have told Jane how you felt she would
have never gone on the date either. My point Holden is that people cant read your
mind. By letting people in even if they are phony they'll care. Being intimidated is
actually a good thing. Its means you do have feelings and care about more people than
just your brother Allie and sister Phoebe.

Finally my third word...

Confident: Yes Holden hard to believe I picked this word after all the [illegible] I have
just given you. But its true and its you. You seem pretty sure of yourself even after
getting kicked out of school and experiencing certain things. Like your first encounter
with a prostitute, that doesn't happen in everyday life Holden. But thats not why I
picked this word. You see when you got kicked out of Pency Prep you decided to go
see your teacher Mr. Spencer and to listen to what he had to say. A teacher who even
flunked you. You see to me that shows you have confidence in yourself even after
everything that's happen to you. Not allot of people have that. Some would say Damn,
what do I do now? Some might even cry. But you, you just delt with it, and that Holden
is truly a gift.

So to wrap up our last meeting I would like to tell you I hope you take my words I have
given you and use them wisely. My biggest fear in life are people, cause [illegible] of
anything. But I know no matter what, I have to deal with them. Just like you Holden.
Whether you like it or not, phony or intelligent you have to deal with them. All of them.
Im sure you can see how my words I picked for you conjoin altogether. See Holden
being in isolation is not helping very much in your life. Your always negative. Its okay to
smile once in awhile. But its also okay to have a crappy day once in awhile too. Look
Holden your young. Enjoy it while you can. [illegible] you grow up to be a mad
and grumpy old man! So take care of yourself Holden. Oh and Holden you can see Mrs.
[illegible] on your way out if your interested in finding a new psychiatrist. It was a pleasure
having you. Hopefully all goes well. Bye Holden."



The reader will note that this student’s response went beyond the 
information in the novel to formulate and test theses about Holden’s 
psyche. In this work, the student analyzed and evaluated relevant 
information and provided evidence for assertions. The writing is 
sufficiently developed, coherent, and well organized. Though there are 
some errors in spelling and usage, they present no problem for 
understanding the student’s meaning. 



Sample Assignments and Work in Mathematics 

Examples of the typical and challenging assignments we gathered and the 
work that resulted from them in mathematics are shown in Figures 2.5 
through 2.7. Figure 2.5 gives an example of an assignment described as
typical of students’ day-to-day activity by the teacher who provided it. 






Figure 2.5: Typical Assignment with Student Work in 10th-Grade
Mathematics

Detective's Dozen

In the following diagram, the measure of arc AC is 14°, AC = [illegible] = 10",
and the radius of the larger circle is 6". Segment PE passes through the
center of the large circle. It is your job as a math detective to investigate the
situation and write down 12 other measurements/calculations that you can
deduce from the given clues. For each fact, you need to explain in detail how
you calculated each answer. Be sure to include the geometry fact(s) that you
applied. You may use the back of this sheet if you do not have enough
room.

[Diagram and handwritten student work not legible in this copy]



To complete this assignment correctly, students have to apply their
knowledge of basic geometry facts (e.g., area of a triangle and circle and facts
associated with chords and tangents of circles). Students also are
required to demonstrate procedural knowledge and apply some problem- 
solving strategies to a fairly routine problem context. The assignment, 
however, does not provide an opportunity for students to demonstrate 
understanding of the underlying principles. Students are required to 
substantiate each measurement or calculation with a basic rule or fact of 
geometry. 






The student work that is shown in Figure 2.5 demonstrates successful 
application of relevant geometry facts with only a few minor procedural 
errors. In general, the student supported each of her answers with a 
reasonably complete explanation by stating a basic rule of geometry. The 
work itself provides little opportunity to judge the student’s conceptual 
understanding of underlying principles, nor does it provide any indication 
of the student’s problem-solving and reasoning abilities. 

Figures 2.6 and 2.7 provide examples of a challenging mathematics 
assignment and a sample of student work responding to it. 

Figure 2.6: Challenging Assignment in 10th-Grade Mathematics

Just Count the Pegs 

Freddie Short has a new shortcut. He has a formula to find the area of any 
polygon on the geoboard that has no pegs in the interior. His formula is like a 
rule for an In-Out table in which the In is the number of pegs on the boundary, 
and the Out is the area of the figure. 

Sally Shorter says she has a shortcut for any geoboard polygon with exactly four 
pegs on the boundary. All you have to tell her is how many pegs it has in the 
interior, and she can use her formula to find the area immediately. 

Frashy Shortest says she has the best formula yet. If you make any polygon on 
the geoboard and tell her both the number of pegs in the interior and the number 
of pegs on the boundary, her formula will give you the area in a flash! 

Your goal in this POW (Problem of the Week) is to find Frashy’s
“superformula,” but you might begin with her friends’ more specialized 
formulas. Here are some suggestions about how to proceed. 

1. Begin by trying to find Freddie’s formula and some variations, as described in
Questions 1a through 1d.

a. Find a formula for the area of polygons with no pegs in the interior. Your 

formula should use the number of pegs on the boundary as the In and 
should give you the area as the Out. Make specific examples on the 
geoboard to get data for your table. 

b. Find a different formula that works for polygons with exactly one peg in 
the interior. Again, use the number of pegs on the boundary as the In and 
the area as the Out. 

c. Pick a number bigger than 1, and find a formula for the area of polygons 

with that number of pegs in the interior. 

d. Do more cases like Question 1c.




2. Find Sally’s formula and others like it, as described in Questions 2a through 
2c. 

a. Find a formula for the area of polygons with exactly four pegs on the 

boundary. Your formula should use the number of pegs in the interior as 
the In and should give you the area as the Out. 

b. Pick a number other than 4, and find a formula for the area of polygons 
with that number of pegs on the boundary. Again, use the number of 
pegs in the interior as the In and the area as the Out. 

c. Do more cases like Question 2b. 

When you have finished work on Questions 1 and 2, look for a superformula that 
works for all figures. Your formula should have two inputs — the number of pegs 
in the interior and the number of pegs on the boundary — and the output should be 
the area of the figure. 

Try to be as flashy as Frashy! 

3. Write-up 

1. Problem statement

2. Process: Explain what methods you used to come up with your formulas. 

3. Solution: Give all the formulas you found. 

4. Evaluation 

5. Self-assessment 



To complete this assignment successfully, students must demonstrate
some conceptual understanding of area and be able to generalize from
specific cases; these are two important mathematical ideas. In addition,
the assignment requires fairly substantial problem solving: students must
generate models, test solutions, and reflect on their problem-solving
strategies in writing. Students are asked to show their work and to support
their solutions with written explanations. The assignment also provides
some guidance on the components students need to include to complete it
successfully.
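The relationship students are asked to discover can be sketched as a short check in Python. The assignment itself does not state the formula. The sketch assumes the standard result usually called Pick's theorem (area = interior pegs + boundary pegs / 2 - 1) and compares it with an independent area calculation for one example polygon. All function names and the example triangle are illustrative, and the interior-peg count works only for convex polygons whose vertices are listed counterclockwise.

```python
from fractions import Fraction
from math import gcd


def edges(vertices):
    """Pair each vertex with the next one, wrapping around at the end."""
    n = len(vertices)
    return [(vertices[i], vertices[(i + 1) % n]) for i in range(n)]


def shoelace_area(vertices):
    """Polygon area computed directly from coordinates (shoelace formula)."""
    total = sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in edges(vertices))
    return Fraction(abs(total), 2)


def boundary_pegs(vertices):
    """Pegs on the boundary: each edge contributes gcd(|dx|, |dy|) pegs."""
    return sum(gcd(abs(x2 - x1), abs(y2 - y1))
               for (x1, y1), (x2, y2) in edges(vertices))


def interior_pegs(vertices):
    """Brute-force count of pegs strictly inside a convex, counterclockwise polygon."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    count = 0
    for px in range(min(xs), max(xs) + 1):
        for py in range(min(ys), max(ys) + 1):
            # A peg is inside if it lies strictly to the left of every edge.
            if all((x2 - x1) * (py - y1) - (y2 - y1) * (px - x1) > 0
                   for (x1, y1), (x2, y2) in edges(vertices)):
                count += 1
    return count


def superformula(interior, boundary):
    """Assumed form of the 'superformula' (Pick's theorem)."""
    return interior + Fraction(boundary, 2) - 1


triangle = [(0, 0), (4, 0), (0, 3)]  # illustrative geoboard polygon
i, b = interior_pegs(triangle), boundary_pegs(triangle)
print(i, b, shoelace_area(triangle), superformula(i, b))
# prints: 3 8 6 6
```

The two specialized formulas in the assignment follow as special cases under the same assumption: with no interior pegs the area is boundary / 2 - 1, and with exactly four boundary pegs the area is interior + 1.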

Student work responding to the Just Count the Pegs assignment appears in 
Figure 2.7. 






Figure 2.7: Student Work for a Challenging Assignment in lOth-Grade 
Mathematics 



1 Problem Stafemenl. 

Frrafcto Shod has s new shortfall to rigdra cut ri-u formula >.□ Find id? mt-.a iu a 
polygon on She geoboard dial has hu pays efl Ihe ihnei pari or I! Tire formula Is 
like n n>fc Fry an in-out table Which Ihe wns (he number or pegs on Ihe 
boundary and the oof is Ihe area d Ihe (ipufft Sully 5ftorta also lias a shorfM 
but hors is fa* any soyjc-i on the oeoboard ihol lias exactly lour pegs on Ihe 
boundary All you need to do (or hei is tell !*r l ow itiahy ol Ihe pegs me In (he 

f irmer pad qJ iho polygon She can use her 1ormuia lo find Ihe are or it quick!? 
r rushy Ehnriijsl flays she has a belter formula Ihen bold ol them tl yuo liana 
Bfty kfnd or pnlygnn a* you nnod lo da « tell her how many pegs are on the inner 
part of the polygon 3rd the number of peps -n ihe boundary, her forni^ula will 
also give you Ihe answer quickly "Hi* yoflt in ctw PQW is to fmd Freshes fomiuia 
but first you need to start with her bends formulas 

3. Process 

t.p/ Tho first ihinp that I dd -was I Uiea out each or Kie shortcut . Fust shortcut I 
did was FrmdriirV Whal I did was l made my own geobaard and pul different 
ehurnis qn|n rl Thpr I made nn in-out table. LooKing at the difersnt polygons 
lhal l had mode l started lo -pm ihpm onto (he in-out table I pul the numiw of 
pegs on Ihe boundary on the Arrand Ihe area of ti-rfr I gore in Ihe Jlrr I looked (or a 
iqie but I could nod find one My In-oul laUle and my polygons locked like Ihls. 



B.J mem i mpnn pery^orss wnn onry one peg in me minnio ir Ana | pjt mwm (in 
the In-oul fobte like I djd .ihovtr II luokncl like this 



CJ i (h«n p^ked a number and the number r pickad was A I cut it into Ihn 
kymuli! .Jnd SO A wixild pc Iho jn and d"*yr» divid'd hy 7 ivjiiat ? which is Hie uOl 
So, A is the m and 3 is the oof Then I picked the number 5 and pul n *i!d die 
formula. Five divided by two equals two and a Hair The in woutl be 5 and Ihe 





15 



Figure 2.7: Student Work for a Challenging Assignment in lOth-Grade 
Mathematics ( continued ) 



out would tie 2 ' •! 



2.3} Whit's I did ‘Mas Iry arid find dil J Hi hiiI prilygms wti"h hail {inly 4 sides and 
h«d unt peg in Ihe middle The only one Ihal I found kx*dd like lt*s 




And my in-otfl Infete looks r ^e 1h<s. 

141 QU3 

j 2- 

Ths formula lhal I came up wrth h Ihe in divided by 2 equal? the cn/t 

EL) Aiiojher number i besides luui j fhst I um pick to puL into the lotmula would 
be 6 And W then purling 0 in Ihe formula I wuuld grf 4 . fi divided by 2 . The Jin 
the re Fa to would be eight and Iheor/f would bn 4 Then I picked fhe number IB 
and put d 'olo the fwmula. 16 divided by 2 equals 9 The in mwkJ be- 16 and It*; 
mrl would bo 9 

Frastry’s ‘’ 51 / 0 ^ 04 ™^".' 



The fcmiuSB that 1 louJbd h, multiply Lite number at pegs in. tne middle of the 
polygon by the boundary T hen yuu have your sooion I eke the soluton 3 rd 
sublrael Iftal by the number Ihe pegs in Ihe middle -of the polygon. Take thai 
so ..inn i end divide il by the ATrer pegs. You then Have andthef e-oIltUo". You 
iner- take that solution end gublreol if by Ihree An example: The in is 2 {number 
■or peys m the nfcddlej and IQ ; boundary i Ihe oaf is fi. ID k 2 : = 20 - 2 = 16 I 2 = 9 
‘ 3 = R 



The aMJut table I had looked likf It..', 






I fu* 







fsT^sfipes'- 

/ 



R 



3 . Soi'ulrufii . 



r rj=d<trts iViurl and SaiFy hhint -r's forrifula was Ihe in divaed by 2 equals Ihn penf 
And Frnshy Shortest was tile 

piullipfy llw nufCifcti of In ihe- middle ul i he polygon by the boundary. Then 
you kjTvg your solution Take Ihe snlutiuh and aufelmu! that by Shu number tlw 
pcqs t, rhn middle -of rho polygon T;ik<? L':li solution and divide it by thE inner 
pegs You then have enolher aolulUm You |h«n Inku that mini ,n and ailbtraci it 






Figure 2.7: Student Work for a Challenging Assignment in 10th-Grade
Mathematics (concluded)



by three. 



4 • Evafulrmr 



y 



I icmncJ 1hi$ problem Srtfciatkinally wwlh-imll fcocausc I IliuughL IL rtat a good 
probiem anq it kepi you Irnnklng on hen# you can do ll anil l think Itiqt \ learned 
from it! I don' think that you could c*vjir<jft bi$ problem to make in better I think it 
Is good enough as ■> is 1 think in a wary Shatttiis problem was loo hard beeauee rt 
took a roa!'y Swig bme and you had lo think really hard. It seemed like buerylime 
you thought diet you -got 11 right Ihw you figured something c+oo out which made 
d wrong I didnl like working on Ihs oroPxxti Mcauce n look too long 



5. Sett-assessment. 

/ I think :het I stuMa get a! lessl ah A- be 63 la* I think Ihdt I did irery well on shia 
and even though aome of Ihe Ihmgs. may not tie corred ! teed wety hard to do H 



The student work shown in Figure 2.7 demonstrates an understanding of 
the concept of area with some minor misconceptions. The student was 
fairly successful at extracting general rules from the patterns that emerged 
from geoboard figures and the data in the In-Out tables. The work 
demonstrates an appropriate use of problem-solving strategies but is not 
entirely successful. For example, although the work indicates a clear 
solution path, the path does not lead to the desired “superformula.” The 
work also contains some major procedural errors. Finally, the work 
communicates a reasonably complete explanation of the problem 
statement, process, and solution; however, the explanation is somewhat 
unclear with regard to finding the “superformula.” 

Examining the Assignments, Student Work, and Feedback 

At the end of the school year, we hired and trained experienced high 
school English/language arts and mathematics teachers to examine the 
assignments and student work that we gathered and to rate them by using 
scoring rubrics that expanded on the Chicago Authentic Intellectual 
Achievement scoring rubrics. Our rubrics examined students’ 
opportunities to construct knowledge, communicate clearly and well, do 
work with authentic purposes, participate in decision making about 
learning activities, use language and mathematics conventions accurately 
and effectively, and refine and improve their work. 






English/Language Arts Assignments



More specifically, in English/language arts, the scoring rubrics for 
assignments examined the following four criteria: 

• Construction of Knowledge . Scorers examined the extent to which 
assignments called for student work that moved beyond the mere 
reproduction of information to the construction of knowledge. 
Assignments that emphasized construction of knowledge required 
students to do more than summarize or paraphrase information 
they had read, heard, or viewed; these assignments required 
students to create or explore ideas that were new to them. 

• Elaborated Communication. Assignments that emphasized 
elaborated communication required extended writing and asked 
students to make assertions and support them with evidence. 

• Authentic Audiences. Assignments that had authentic audiences 
fulfilled purposes other than merely earning course credit and had 
audiences other than the teacher as grader. These assignments 
asked students to consider the concerns of and present their work 
to authentic audiences. 

• Student Involvement in Crafting Assignments. Scorers looked for 
evidence that students were invited to make choices about what 
they would study and how they would learn. Scorers also looked 
for teachers’ guidance on how students could meet instructional 
goals. 

English/Language Arts Student Work

The scoring rubrics for student work in English/language arts followed 
some of the same criteria used for assignments; they examined three 
features of English/language arts work: 

• Construction of Knowledge. Scorers examined student work for the 
degree to which it moved beyond the reproduction of information 
to the construction of knowledge. Work that demonstrated 
construction of knowledge did more than summarize or paraphrase 
information students had read, heard, or viewed; it showed that 
students created or explored ideas that were new to them. 

• Elaborated Communication. Scorers also examined the extent to 
which students demonstrated elaborated communication through 
extended writing that made an assertion and then supported it with 
evidence. This rubric also examined the extent to which student 
writing was sufficiently developed, coherent, and well organized. 

• Effective Use of Language Conventions and Resources. The final
student work rubric examined the extent to which students
demonstrated proficient use of language conventions and language
resources. This rubric looked for spelling, vocabulary, grammar,
and punctuation that were appropriate for 10th-grade work; it also
looked for artistic use of language resources, including diction,
syntax, imagery, and figurative language.

Mathematics Assignments 

In mathematics, the scoring rubrics for assignments examined five sets of 
criteria: 

• Important Mathematics Content. Scorers examined the extent to 
which assignments called for student work demonstrating deep 
conceptual understanding in one or more of the important ideas in 
mathematics. These important ideas refer to the large and unifying 
ideas that help link smaller pieces of mathematics knowledge, that 
undergird procedural skills, and that connect mathematics within 
and between content domains. Assignments should provide a key 
purpose for learning mathematics and should serve as organizing 
ideas for instruction. Among the important ideas that 10th-grade
assignments address are chance, dimension, change and growth, 
transformation, interrelationships, translation of problems from one 
language to another, proportionality, and function and recursion. In 
addition, critical mathematical processes that support the 
development of these important ideas, such as proof, making and 
justifying conjectures, and using models and varied 
representations, are considered essential ideas. 

• Problem Solving and Reasoning. Assignments that required 
problem solving or reasoning asked students to formulate problems 
from situations, make generalizations, judge the validity of 
arguments, make models, and construct valid arguments and 
proofs. These go beyond assignments that require students to 
retrieve or reproduce fragments of knowledge or simply apply 
previously learned algorithms or procedures. 

• Effective Communication about Mathematics. Scorers examined 
the extent to which assignments explicitly called for 
communication of mathematical understanding. Assignments that 
called for communication asked students not only to “show their 
work” (i.e., provide a trace of the solution path) but also to 
“explain or justify,” providing insight into the clarity of the 
students’ mathematical understanding. 

• Relevant Context and Real-World Connections. Scorers looked for
evidence of the extent to which assignments asked students to
address mathematical questions, issues, or problems similar to ones
encountered in the experience of mathematicians and other
professionals who use mathematics to solve problems. In addition,
this rubric examined the extent to which assignments specified an
“authentic audience” for student work products.

• Student Involvement in Crafting the Assignments. Scorers 
examined the extent to which assignments allowed students to 
decide which topics they would investigate and which problems 
they would tackle. This rubric also examined the extent to which 
assignments gave students guidance in making choices about 
topics and problems that met their instructional goals. 

Mathematics Student Work 

The scoring rubrics for student work in mathematics examined four 
characteristics: 

• Conceptual Understanding. Scorers examined student work for the 
degree to which it demonstrated conceptual understanding related 
to one or more important ideas in mathematics. Students 
demonstrated conceptual understanding when they provided 
evidence that they could represent and classify mathematical 
entities; recognize, label, and generate examples and non-examples 
of concepts; use and interrelate models, diagrams, manipulatives, 
and varied representations; and identify and apply mathematical 
principles. 

• Procedural Knowledge. Scorers also examined the extent to which 
students demonstrated procedural knowledge of mathematical 
content, including knowledge of the key skills and processes in 
10th-grade mathematics. Students demonstrated procedural
knowledge by selecting and correctly applying appropriate 
procedures, verifying or justifying the correctness of a procedure 
using concrete models or symbolic methods, or extending or 
modifying procedures to deal with specific factors in problems. 

• Problem Solving and Reasoning. This student work rubric
examined the extent to which students demonstrated skill and
understanding in problem solving and reasoning. Student work that
demonstrated problem solving included problem descriptions,
determinations of desired outcomes, generation of appropriate
models, selection of possible solutions, solution strategy
alternatives, testing of trial solutions, evaluation of outcomes, and
any needed revisions of solution steps and strategies. Student work
that demonstrated mathematical reasoning involved evidence of
logical, systematic thinking. This included intuitive, deductive, or
inductive reasoning in making and justifying conjectures and
solving problems. Reasoning often involved hypothesizing,
predicting, analyzing, generalizing, synthesizing, or proving.

• Effective Communication. Scorers also examined the extent to 
which students demonstrated organized and consolidated 
mathematical thinking through written and oral communication; 
they looked for coherent and clear communication of mathematical 
thinking to peers, teachers, and others and for the correct use of 
mathematical notation and terminology. 

Teacher Feedback 

In addition, for both English/language arts and mathematics student
work, scorers looked for evidence of teacher feedback that would
support student learning and help students improve their work
in the future. Starting with the thesis that teacher feedback can help
students learn and improve their work (Bransford, Brown, & Cocking,
1999), scorers examined the amount and nature of teacher feedback on the
student work samples:

• Informative Feedback. Scorers examined student work for the 
extent to which written feedback was provided, suggestions were 
made for the kinds of things students could do to strengthen the 
work, and guidance was provided on the application of the 
feedback to future work. 

Examples of teachers’ feedback are shown in Figures 2.2, 2.4, and 2.7 in
this chapter and in Figures 3.3, 3.4, 3.11, and 3.12 in the next chapter. The
reader will note that for some of these artifacts, teacher feedback merely
corrects errors in mechanics and content or comments on the quality of the
work but does not say how to improve it. In other cases, teacher feedback
provides information or a concept the student can use to refine the current
work. None of these examples provides guidance for producing better work
in the future.

Conducting the Scoring 

As noted above, we hired 12 experienced teachers in English/language arts 
and 12 experienced teachers in mathematics to participate in scorer 
training and then score assignments and student work in their subjects 
during the summer. The scoring sessions were led by experts in the 
Authentic Intellectual Achievement framework and in the tenets of the Bill 
& Melinda Gates Foundation high school reform initiative. In each 
subject, teachers worked for a week to master the scoring rubrics and 
apply them to the Washington State assignments and student work. 







To control for potential scorer bias, assignments and work were randomly
assigned to scorers for each rubric. Each English/language arts
assignment was scored on the four assignment rubrics just described; 
student products were scored on three student work rubrics and one 
teacher feedback rubric. Mathematics assignments were scored on five 
assignment rubrics; student products in mathematics were scored on four 
work rubrics and one feedback rubric. 
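
The rubric inventory described in this chapter can be written down as a small data structure, which makes the per-subject totals easy to check. The sketch below is illustrative only: the rubric names come from the text above, while the dictionary layout and the helper name `rubric_count` are assumptions made for this example.

```python
# Rubric names are taken from this chapter; the layout is illustrative.
RUBRICS = {
    "English/language arts": {
        "assignment": [
            "Construction of Knowledge",
            "Elaborated Communication",
            "Authentic Audiences",
            "Student Involvement in Crafting Assignments",
        ],
        "student work": [
            "Construction of Knowledge",
            "Elaborated Communication",
            "Effective Use of Language Conventions and Resources",
        ],
        "teacher feedback": ["Informative Feedback"],
    },
    "Mathematics": {
        "assignment": [
            "Important Mathematics Content",
            "Problem Solving and Reasoning",
            "Effective Communication about Mathematics",
            "Relevant Context and Real-World Connections",
            "Student Involvement in Crafting the Assignments",
        ],
        "student work": [
            "Conceptual Understanding",
            "Procedural Knowledge",
            "Problem Solving and Reasoning",
            "Effective Communication",
        ],
        "teacher feedback": ["Informative Feedback"],
    },
}


def rubric_count(subject):
    """Total number of rubrics applied in one subject (hypothetical helper)."""
    return sum(len(names) for names in RUBRICS[subject].values())


for subject in RUBRICS:
    print(subject, rubric_count(subject))
```

The totals, 8 rubrics in English/language arts and 10 in mathematics, match the counts reported in Chapter 3.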

As a check on the reliability of scoring, all English/language arts and 
mathematics assignments were randomly assigned to a second teacher for 
a second scoring. Half of the student products in English/language arts 
were double-scored, and 40% of the mathematics student work was
double-scored.
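
A random assignment with a double-scored subset could be sketched as follows. This is not the procedure used in the study (the Technical Appendix documents that); it is a minimal illustration for a single rubric. The function name, the item identifiers, the fixed seed, and the rule that the second scorer differs from the first are all assumptions.

```python
import random


def assign_scorers(item_ids, scorer_ids, double_score_fraction, seed=0):
    """Randomly give every item a first scorer, and give a random fraction
    of items a different second scorer (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    first = {item: rng.choice(scorer_ids) for item in item_ids}
    n_double = round(double_score_fraction * len(item_ids))
    second = {}
    for item in rng.sample(item_ids, n_double):
        others = [s for s in scorer_ids if s != first[item]]
        second[item] = rng.choice(others)
    return first, second


scorers = [f"scorer_{i}" for i in range(1, 13)]  # 12 scorers per subject
work = [f"work_{i}" for i in range(1, 21)]       # made-up item IDs
first, second = assign_scorers(work, scorers, double_score_fraction=0.5)
print(len(first), len(second))
```

Setting `double_score_fraction` to 1.0 would correspond to the assignments, all of which were scored twice, and 0.4 to the mathematics student work.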

The dataset that the scorers generated in summer 2003 and the
analyses it supported are discussed in the next chapter.



5 See the Technical Appendix of this report for details on our paper assignment and 
scoring procedures. 







Chapter 3: The Quality of Our Measures 



In this chapter, we examine the procedures used to score assignments, 
student work, and teacher feedback. We begin with an examination of the 
reasonableness of the scoring data. We investigate whether the scores 
assigned to assignments and student work are likely to reflect important 
differences in the rigor and quality of the assignments, in the quality of 
students’ efforts, and in the utility of feedback. This chapter takes a first 
step toward answering questions about the characteristics of the scoring 
data and the likely value of the information they provide. 

We can think about the characteristics of our data in two ways. First, we 
can think about whether the different sets of scoring data relate to each 
other in sensible ways. Second, we can compare the scoring data with 
other things we know about teaching and learning in participating schools 
to see if the relationships make sense. In this project, we will do both. In 
this report, we will look at the different sets of scoring data to see if they 
make sense. Our next report, due in April 2004, will delve deeper into the 
relationships among different scoring data and it will relate the data to 
other information on teaching and learning in the Washington schools. 

Scoring Assignments and Student Work 

Scorers at our 2003 scoring session examined 177 English/language arts 
assignments and 399 student responses. In mathematics, scorers rated 184 
assignments and 425 pieces of student work. As we mentioned in the 
previous chapter, there were 8 different English/language arts scoring 
rubrics and 10 mathematics rubrics. Some of the scoring rubrics had
3-point scales, others had 4 or 5 points, and one had a 6-point scale. Again,
all the assignments in English/language arts and mathematics were
double-scored, and half of the student work products in English/language
arts and 40% in mathematics were scored twice. We gathered these second
scores in order to make judgments about the consistency of the ratings.

Reliability of Scoring 

We examined the assignments and work that were double-scored and
counted the number for which scorer pairs were in perfect agreement, the
number for which raters’ scores differed by one point, and the number for
which scores differed by more than one point. For the eight
English/language arts rubrics, scorer pairs were in perfect agreement on
between 60% and 69% of the papers, depending on the rubric. The scores
they assigned were the same or differed by no more than one point for
between 79% and 96% of the papers. We consider these agreement rates to
be acceptable; they are typical of agreement rates for performance
assessment scorings with rubrics similar to ours, and they are
similar to agreement rates calculated by the Chicago researchers.

6 See the Technical Appendix of this report for details on the rubric scales.

7 See the Technical Appendix of this report for details on scoring and scoring reliability.

Agreement rates on the 10 mathematics rubrics were slightly lower. 

Perfect agreement rates ranged from 44% to 92% of the papers on the 
different mathematics rubrics. Agreement rates on the different 
mathematics rubrics increased to between 79% and 100% of the papers 
when assignments and work with scores differing by one point were added 
to the calculation. 
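
The two agreement statistics described above (perfect agreement, and agreement within one point) can be computed directly from pairs of scores. The sketch below is illustrative: the function name and the score pairs are made up, and only the definitions of the two rates come from the text.

```python
def agreement_rates(score_pairs):
    """Return (exact, within_one) agreement as fractions of the
    double-scored papers for one rubric."""
    if not score_pairs:
        raise ValueError("need at least one double-scored paper")
    n = len(score_pairs)
    exact = sum(1 for a, b in score_pairs if a == b)
    within_one = sum(1 for a, b in score_pairs if abs(a - b) <= 1)
    return exact / n, within_one / n


# Made-up (first score, second score) pairs, for illustration only.
pairs = [(3, 3), (2, 3), (4, 4), (1, 3), (2, 2)]
exact, within_one = agreement_rates(pairs)
print(f"exact: {exact:.0%}, exact or adjacent: {within_one:.0%}")
```

With these five pairs, three agree exactly and four agree within one point, so the rates are 60% and 80%.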

Although we consider these agreement rates acceptable, we plan to 
continue examining our scoring rubrics, training materials, and scoring 
procedures to see if we can strengthen our efforts and improve agreement 
rates as the study continues. 

Scoring Data in English/Language Arts 

After examining the reliability of our scoring data, we used a test analysis 
model called the Many-Facet Rasch Model to combine data across the 
different scoring rubrics and teacher scorers so that we could use a single 
score to characterize the rigor and authenticity of each English/language 
arts assignment. This combined score ranged from 0 to 10. We did the 
same for each piece of student work in English/language arts — that is, we 
created a single score to represent the quality of each student product. Like
the assignment scores, the combined student work scores ranged from 0 to
10. We followed these same procedures for mathematics assignments
and student work.
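
For readers unfamiliar with the model, a common rating-scale form of the Many-Facet Rasch Model in the measurement literature is shown below. It is offered as background, not as the exact specification used in this study, which is documented in the Technical Appendix. The mapping of facets to this study (objects as assignments or student work, items as rubrics, judges as teacher scorers) is an interpretive assumption.

```latex
\log\!\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right) = B_n - D_i - C_j - F_k
```

Here $P_{nijk}$ is the probability that object $n$ receives score category $k$ rather than $k-1$ from scorer $j$ on rubric $i$; $B_n$ is the measure of the assignment or piece of student work; $D_i$ is the difficulty of rubric $i$; $C_j$ is the severity of scorer $j$; and $F_k$ is the difficulty of the step from category $k-1$ to $k$. Because the rubrics in this study had scales of different lengths, a variant with rubric-specific step terms $F_{ik}$ may be the closer match.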



8 See the Technical Appendix of this report for the agreement rates on individual rubrics 
and for other data on the reliability of the scoring. 

9 See the Technical Appendix of this report for detail on the Many-Facet Rasch Model 
and for the results of the modeling process. 

10 Because the Rasch analyses used to produce single scores for assignments and student 
work products in each of the subject areas were conducted independently (and because 
the scoring rubrics differ across subject areas and for assignments and student work 
products), the resulting scales are not comparable. Thus, similar scores on the 0-10 scales 
hold different meanings, so a score of 3 on one scale does not necessarily hold the same 
meaning as a score of 3 on another scale. 







Because there was only one feedback rubric in English/language arts and 
one in mathematics, it was not necessary to create a combined scale for the 
teacher feedback scores. Feedback scores are reported here on the same 1 
to 4 scale on which they were originally assigned. 

There are three ways we can make judgments about the reasonableness of 
the combined scores for assignments and student work. We will talk about 
all three of them next and provide data to help the reader examine our 
judgments about the data. 

• The first way to determine if the combined scores make sense is to 
examine assignments and work that got low scores in the summer 
scoring and those that got high scores to see if these 
characterizations are believable. 

• The second way to think about the reasonableness of the data is to 
examine the distributions of assignment and student work scores to 
see if the distributions have the expected properties — that is, that 
scores are approximately normally distributed without unusual dips 
or spikes in the displays. 

• The third way to judge the reasonableness of the data is to compare 
the scores for assignments that teachers described as typical of 
students’ day-to-day activities with those for assignments 
described as challenging for students. In general, we would expect 
the typical assignments and the student work that went with them 
to get lower scores than the challenging assignments and resulting 
work. 
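
The third check amounts to comparing average combined scores by assignment type. A minimal sketch follows; the scores are invented for illustration, and the function name is hypothetical.

```python
from statistics import mean


def mean_by_type(records):
    """Average combined score for each assignment type."""
    groups = {}
    for assignment_type, score in records:
        groups.setdefault(assignment_type, []).append(score)
    return {t: mean(scores) for t, scores in groups.items()}


# Made-up combined scores on the 0-10 scale, for illustration only.
records = [
    ("typical", 4.0), ("typical", 5.5), ("typical", 6.0),
    ("challenging", 6.5), ("challenging", 7.0), ("challenging", 8.5),
]
means = mean_by_type(records)
print(means["challenging"] > means["typical"])
```

If the scores behave as expected, the mean for challenging assignments exceeds the mean for typical ones, as it does in this invented data.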

In this section of the report, we examine the English/language arts results 
using all three approaches. We look at sample assignments and student
work, examine score distributions, and compare scores for assignments
described as typical with scores for assignments described as
challenging, along with the scores for the associated work.

Examples of Low- and High-Scoring Assignments and Student Work in
English/Language Arts

Figure 3.1 provides an example of a low-scoring English/language arts 
assignment along with a low-scoring student response. 







Figure 3.1: Low-Scoring Assignment and Low-Scoring Student Work in 
English/Language Arts 



[Scanned worksheet with student answers; the text is not legible in this
transcription. The recoverable section labels are fill-in, true/false, and
matching items on literary terms such as irony and point of view.]



The reader will note that to complete this assignment students are not 
expected to go beyond reproduction of knowledge; students need only 
demonstrate understanding of literary terms. The assignment does not 
require extended writing and the work that results is unlikely to have an 
audience beyond the teacher as grader. 



The student work that is shown in Figure 3.1 is fairly successful but, 
again, the assignment does not call for an original or elaborated response 
or for demonstration of complex understanding. Modest scores seem 
appropriate for this assignment and student work sample. 






Figure 3.2 provides an example of an assignment that received high 
ratings from our scorers. 

Figure 3.2: High-Scoring Assignment in English/Language Arts 

Reflective Essay 

After reading the book Night, by Elie Wiesel, study the reflective
essays included in your packets. Prepare to discuss the characteristics 
of reflective essays, including the use of dialogue, description, inner 
monologue, and conclusions that make “significant” statements and 
resolve “internal conflicts”. Afterward, draft your own reflective essay. 
Remember to pre-write, draft, and review. I will give you feedback on 
your draft and a grading rubric for the final essay. 

This assignment calls for extended writing and demonstration of the tone, 
style, and conventions of reflective essays. To complete this assignment 
well, students need to move beyond reproduction of knowledge and 
explore new ideas. Students are asked to choose a reflection topic and 
demonstrate their analysis and interpretation skills. Students are 
encouraged to refine their work as they complete the assignment’s 
successive parts and on the basis of teacher feedback. A high score seems 
sensible for this assignment. 

Figure 3.3 provides an example of a high-scoring student response to this 
assignment. 






Figure 3.3: High-Scoring Student Work in English/Language Arts 



"I love you. I'll pick you up around nine o'clock," mom said as she
leaned in for a kiss. As I pulled away I replied, "Yeah, bye mom." That was the
last conversation I thought I would ever have with my mom.

It was a Wednesday "Late Start" morning before school. Mom said
goodbye before leaving for work around 7:45. She was [illegible] me that
morning. For no reason at all I was frustrated by my mom. What had she done?
She had done nothing wrong, and still I didn't feel a bit of guilt for talking to
her that way. Why should I feel bad? I'll see her at nine o'clock [illegible], I
thought as I finished getting ready.

Nine o'clock rolled around and there was no sign of my mom. I didn't
think anything of it. I'll give her a few extra minutes. Quickly the clock read 9:10
and my mom was still not home. I decided to give her a call at work. "You've
reached the voice mail of [name not legible]," said the machine. Well, she's
on her way, I thought as I dialed her cell phone number. "Hey, you've reached the
cell phone..." Where was she?

It was [time illegible] and I could picture the sound of the high-pitched first bell
ringing in my head. I tried both phones again. There was no answer. Soon I
could picture the sound of the second bell. I tried both phones again. There was
still no answer. [Time illegible]. I knew something had to be wrong, so I decided
to walk the whole [number illegible] blocks to school.

As I walked, I started having troubling thoughts. Thoughts of mom
being in a bad car accident. I imagined her lying on a stretcher unconscious. I
tried telling myself that I was too paranoid. As I was having all these thoughts, I
had just rounded the corner on to [street name illegible]. The moment I looked up my heart
stopped. My whole body went into a cold sweat and I got chills down my back. All
around, in front of the high school, were police cars, fire trucks, and the
dreaded ambulance. I felt like crying. I felt like going home and turning the
clocks back to when my mom had leaned in to kiss me. I thought of our
conversation we last had. How could I have treated her so rudely for no reason?
I felt so bad inside. I just wanted to lie down and go to sleep. Maybe if I woke up
again it would all be just a bad dream.






Figure 3.3: High-Scoring Student Work in English/Language Arts (concluded)



I walked slowly on, thinking of the relationship my mom and I shared. I
was headed straight for all the commotion. Then I saw my dad. He was standing
on the sidewalk in the middle of all the chaos. I watched him as he stared at the
ground. His expression was one I will never forget. This is real, I thought. The
reason my mom had not picked me up at nine o'clock had just been [illegible].

"[Illegible] believe it?" dad said with a smirk as he nudged my arm. I
looked up at him and started to cry. "What is the matter, what happened?"
he asked me. "What kind of question is that, dad? What's going on?" I replied
with frustration. He [illegible] and explained to me that a student set
one of the school bathrooms on fire. That was what caused all the excitement.
"That's it?" I [illegible]. That was the best news I had ever heard in my life! After
saying goodbye to my dad, I walked into the school and headed for my mom's
office to tell her how much I loved her.

What makes us so frustrated when all there is, is love? Why do people
have built up anger in them for no reason? I will never know the answer to these
questions but I do know that I want to change this within myself. I realized that I
should live each day expressing my true feeling. From that day on, I decided to
never complete a day without sharing at least five great feelings I was having. I
made the decision to see the best in people and love each moment I have. As for
my mom, I always wonder about her when we aren't together, but there's a peace
in my wonderment. I never have to worry about regretting what I last said to her,
because I know the last words were the greatest words of all, "I love you!"






The student work in Figure 3.3 includes extended writing that makes a 
point and then supports it with evidence. The essay is rich with detail and 
illustration. The writing demonstrates good analysis skills and competent 
command of language conventions and resources. Again, an upper-level 
score for this writing sample seems appropriate. 

We end our discussion of the plausibility of the English/language arts 
scores with an example of teacher feedback. The student work in Figure 
3.4 responds to an assignment on social and political issues. The 
assignment introduces several issues and then points students to relevant 
written and video materials on the issues. The assignment asks students to 
choose from among a set of topic statements and begin writing. The 
student work in Figure 3.4 includes teacher feedback. 







Figure 3.4: Teacher Feedback in English/Language Arts 



[Scanned student essay with teacher feedback; the text is not legible in
this transcription.]






This feedback provides information the student can use to revise or 
improve his or her essay. The feedback is specific to this work, though, 
and does not provide guidance on the application of the feedback to future 
work. This submission got a score of 3 on the 4-point teacher feedback 
rubric in English/language arts. 

Score Distributions for Assignments and Student Work in
English/Language Arts

Next, we examine the distributions of scores for assignments and student 
work, looking for any irregularities that signal potential problems with the 
rubrics and their implementation. Figures 3.5 through 3.7 display score 
distributions for assignment, student work, and feedback data in 
English/language arts. Figure 3.5 displays scores for English/language 
arts assignments. 

Figure 3.5: Score Distribution for Assignments in English/Language Arts 




[Chart not reproduced in this transcription; horizontal axis: Score.]

The reader will note that the English/language arts assignment scores are 
fairly evenly distributed across all of the score intervals. However, there 
are more scores in the top half of the distribution than in the bottom half. 
Because we hope to use these scales to chronicle change in teaching over 
time and because top scores better represent the foundation’s instructional 
intentions, it would be preferable to have scores cluster initially in the 
bottom half of the distribution. Documenting positive change would be 
easier with more room at the top of the score scale. As we move forward
in our work, we will look for opportunities to draw finer distinctions
among assignments scoring in the top half of the English/language arts
assignment distribution.11

Figure 3.6 shows the distribution of scores for student work in 
English/language arts. 



Figure 3.6: Score Distribution for Student Work in English/Language Arts

[Figure: bar chart; horizontal axis: Score, in intervals up to >9-10]



Here again, we see a clustering of scores above the midpoint of the 
distribution and would prefer to see more scores in the bottom half of the 
distribution. As we prepare for next summer, we will examine the score 
points with very little student work data and look for opportunities to draw 
finer distinctions between student work that is now clustered between 
scores of 6 and 8. 

Figure 3.7 shows the distribution of scores for teacher feedback in 
English/language arts. Again, these data are reported on the scoring rubric 
scale, with the bottom score assigned to papers with no written feedback 
and the top score given to papers with feedback that informed 
improvements to the student’s current work and that provided guidance for 
strengthening future products. 



11 The pros and cons of making refinements to the rubrics or scoring procedures are
detailed in Chapter 5.



Figure 3.7: Score Distribution for Teacher Feedback in English/Language 
Arts 




This display shows that approximately 40% of English/language arts 
papers received a score of 1; these papers included no written feedback.
Less than 10% of the work received a score of 4; these papers included 
feedback about possible improvements to the current and future work. 
These data on the English/language arts feedback rubric indicate that there 
is ample room for improvement of teacher practice. 

Scores for Typical and Challenging Assignments and the Resulting 
Student Work in English/Language Arts

Figure 3.8 takes the English/language arts assignment data shown above 
and displays the scores separately for typical and challenging assignments. 
In Figure 3.8 typical assignment scores are shown on the left of each pair 
of bars, while scores for challenging assignments appear on the right in 
each pair of bars. 






Figure 3.8: Score Distribution for Typical and Challenging Assignments
in English/Language Arts

[Figure: paired bars for typical and challenging assignments; horizontal axis: Score]



This graph shows that more of the typical assignments in English/language 
arts scored low on rigor and authenticity, and more of the challenging 
assignments scored high on this metric. This pattern of results is the one 
we hypothesized and is similar to patterns that appear in the Chicago data. 

Figure 3.9 provides similar results for student work in English/language 
arts. 






Figure 3.9: Score Distribution for Student Work on Typical and
Challenging Assignments in English/Language Arts

[Figure: paired bars for typical and challenging assignments; horizontal axis: Score, in intervals 0-1, >1-2, >2-3, >3-4, >4-5, >5-6, >6-7, >7-8, >8-9, >9-10; vertical axis scaled from 0 to 45]



These data, too, show the expected pattern. In general, student work 
produced in response to challenging assignments received higher ratings 
than work produced for typical assignments. The fact that more of the 
student work on challenging assignments received high scores shows that 
students’ responses to challenging assignments were more complex than 
their responses to typical assignments. 
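
The score intervals used in the distribution figures (0-1, >1-2, and so on up to >9-10) and the typical-versus-challenging comparison can be reproduced with a short tabulation. The sketch below is illustrative only: the function names and the sample scores are hypothetical and are not drawn from the study data.

```python
import math
from collections import Counter


def interval_label(score):
    """Map a combined-scale score in [0, 10] to its display interval."""
    if not 0 <= score <= 10:
        raise ValueError("score must lie between 0 and 10")
    # Scores from 0 through 1 share the first interval; every later
    # interval excludes its lower bound and includes its upper bound.
    upper = max(1, math.ceil(score))
    lower = upper - 1
    return "0-1" if lower == 0 else f">{lower}-{upper}"


def percent_distribution(scores):
    """Return {interval label: percent of scores} for one group of artifacts."""
    counts = Counter(interval_label(s) for s in scores)
    total = len(scores)
    return {label: 100.0 * n / total for label, n in counts.items()}


# Hypothetical scores, invented for illustration only.
typical = [2.4, 3.1, 3.9, 5.0]
challenging = [5.2, 6.8, 7.5, 7.9]
print("typical:", percent_distribution(typical))
print("challenging:", percent_distribution(challenging))
```

Tabulating the two groups separately, as here, is what allows the paired bars to be compared interval by interval.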



Scoring Data in Mathematics 

This section of the chapter reports the comparable set of analyses for 
mathematics assignments and student work. The text and exhibits in this 
section of the report discuss the reasonableness of the scoring data by 
showing low- and high-scoring artifacts, the distributions of scores for 
mathematics assignments and student work, and score data for typical and 
challenging assignments and the associated student work. 

Examples of Low- and High-Scoring Assignments and Student Work in
Mathematics 

Figure 3.10 provides an example of an assignment that scored low on the 
combined scale for mathematics assignments. 






Figure 3.10: Low-Scoring Assignment and Low-Scoring Student Work in 
Mathematics 



Inverse Assignment 

Complete problems 7-12 on page 302 and 15-32 and 35-39 on page 303 of your 
textbook. Show your work and use graph paper as needed. 



[Scanned textbook exercises on inverse functions, with the student's handwritten answers; the scan is not legible.]




To complete this assignment well, students have to invert a function and 
construct two-dimensional graphs of the function and its inverse. Though 
functions are among the important mathematical ideas that 10th graders 
encounter, this assignment requires students to demonstrate little or no 
conceptual understanding of functions, and the assignment itself is only 
tangentially related to the topic. In addition, the assignment requires no 
demonstration of problem solving or reasoning; it requires little more than 
a numerical solution or graph with no explanation for how the solution 
was reached. This assignment makes no attempt to create a problem 
situation that reflects the use of functions in a real-world application. 






Though the student work that is featured in Figure 3.10 is competent, the 
work itself offers little evidence of complex understanding. It seems 
reasonable that scorers gave modest scores to this assignment and this 
student response. 

Figure 3.11 provides an example of a high-scoring assignment and student 
response. 

Figure 3.11: High-Scoring Assignment and High-Scoring Student Work in 
Mathematics 



Vaulted Ceiling Assignment

Solve the problem, show your calculations, and provide a detailed
description of your solution process.

[Scanned assignment sheet, "The Vaulted Ceiling Problem": a contractor must match the curve of a vaulted ceiling to an "eyebrow window," a window with a curved top and a straight base whose shape is a slice of a circle. Students are asked to figure out how wide the base of the window will be. The scan is only partly legible.]

[Scanned handwritten student solution; the scan is not legible.]



To complete the assignment in Figure 3.11 successfully, students have to 
demonstrate their understanding of the geometric facts and theorems 
related to circles and to argue clearly for their solutions. This assignment 
addresses a problem that is authentic and reflects the types of 
mathematical questions that are encountered by people in the real world. 
Although demanding only a moderate to low level of conceptual 
understanding of the domain, the assignment does require students to 
demonstrate a fairly high level of problem solving and reasoning within a 
problem setting that is likely to be relatively unfamiliar to most students. 
In addition, the assignment requires students to show their solution path 
and provide a detailed explanation and justification for the work. 

The student work that is featured in Figure 3.11 demonstrates clear 
conceptual understanding and procedural knowledge related to the 
relevant facts and theorems and is free of misconceptions and procedural 
mistakes. The problem-solving strategies and reasoning are appropriate 
and lead to the successful completion of the problem. The student work 
includes a solution path with complete and accurate explanation and 
justification of the conclusions. It is easy to appreciate the high marks 
scorers gave to this assignment and student response. 
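
To make the underlying geometry concrete, the sketch below shows one way to find the width of the straight base of a slice of a circle, using the right triangle formed by the radius, half of the base, and the distance from the circle's center to the base. It is offered as one possible solution route, not as the student's method, and the dimensions are hypothetical round numbers rather than the values given in the assignment.

```python
import math


def chord_width(radius, depth):
    """Width of the straight base of a slice cut from the top of a circle.

    radius: radius of the full circle
    depth:  height of the slice, from the straight base to the top of the arc
    Both arguments must use the same unit.
    """
    if not 0 < depth <= 2 * radius:
        raise ValueError("depth must be positive and at most the diameter")
    # Right triangle: the radius is the hypotenuse, (radius - depth) is the
    # distance from the center to the base, and the remaining leg is half
    # of the base.
    half_base = math.sqrt(radius ** 2 - (radius - depth) ** 2)
    return 2 * half_base


# Hypothetical dimensions, not taken from the assignment.
print(chord_width(radius=10, depth=2))  # 12.0
```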







Figure 3.12 provides an example of teacher feedback in mathematics. 



Figure 3.12: Teacher Feedback in Mathematics 



Surface Area Packaging Assignment 

Choose a consumer product and calculate the surface and lateral areas of 
packaging material for it. Create a net diagram and answer questions about 
area prisms. 

[Scanned student worksheet with the teacher's handwritten feedback; the scan is not legible.]



Like the scores for teacher feedback in English/language arts, the scores 
for feedback on mathematics student work ranged from 1 to 4, with scores 
of 1 going to artifacts with no written feedback and top scores going to 
work with informative feedback. The feedback in Figure 3.12 received a
score of 3; it provides information the student can use to improve this
work, but no guidance for future work.

Score Distributions for Assignments and Student Work in Mathematics 

Figures 3.13 and 3.14 show how scores are distributed on the combined 
score scale for mathematics assignments and student work. Figure 3.15 
displays data on the feedback metric in mathematics. 

Figure 3.13: Score Distribution for Mathematics Assignments

[Figure: bar chart; horizontal axis: Score]

The combined scale data for mathematics assignments in Figure 3.13 are 
fairly evenly distributed across the score points. There is a slight skewness 
to the distribution, with a higher percentage of assignments getting scores 
of 5 or above than getting scores of 4 and below. Nonetheless, there is 
sufficient room to measure change in teacher practice as schools make 
progress. 






Figure 3.14 shows the data for student work in mathematics. 



Figure 3.14: Score Distribution for Student Work in Mathematics

[Figure: bar chart; horizontal axis: Score]

The combined scale data for student work in mathematics are highly
skewed, with approximately 80% of the work receiving scores below 4. It
appears that the current rubrics for assessing student work in math provide 
ample opportunity to document positive change as participating schools 
convert to small learning communities. 






Figure 3.15 displays scorer data on the feedback rubric in mathematics.

Figure 3.15: Score Distribution for Teacher Feedback in Mathematics

[Figure: bar chart; horizontal axis: Score]



This display shows that less than 20% of the work in mathematics had any 
written feedback, and less than 2% included feedback that provided 
guidance for refining the work. The data show that students in 
participating classes got very little written feedback to inform possible 
improvements to their work. None of the work included feedback that 
provided guidance for strengthening future work. 

Scores for Typical and Challenging Assignments and the Resulting 
Student Work in Mathematics 

Our final examination of the data in this report considers combined scale 
data for typical and challenging assignments in mathematics and for 
student work done in response to typical and challenging tasks. Figure 
3.16 shows the data for assignments, and Figure 3.17 gives the data for 
student work. 






Figure 3.16: Score Distribution for Typical and Challenging Mathematics
Assignments

[Figure: paired bars for typical and challenging assignments; horizontal axis: Score, in intervals 0-1, >1-2, >2-3, >3-4, >4-5, >5-6, >6-7, >7-8, >8-9, >9-10; vertical axis scaled up to 45]



Like the English/language arts data for typical and challenging 
assignments, the data in Figure 3.16 show the expected pattern. On 
average, the scores given to mathematics assignments that teachers 
regarded as challenging are higher than the scores given to typical 
assignments. Most of the mathematics assignments with scores of 7 or 
higher are from the challenging group. 






Figure 3.17: Score Distribution for Student Work on Typical and
Challenging Assignments in Mathematics

[Figure: paired bars for typical and challenging assignments; horizontal axis: Score, in intervals 0-1, >1-2, >2-3, >3-4, >4-5, >5-6, >6-7, >7-8, >8-9, >9-10]



Though considerably less prominent than the distribution pattern seen in 
Figure 3.16, the pattern of results for mathematics student work in Figure 
3.17 follows the same form, with more of the work associated with 
challenging assignments receiving higher scores than work responding to 
typical assignments. Very few pieces of mathematics work were scored 
higher than 6, and those that were are responses to challenging 
assignments. 

Relating These Data to Other Information on Teaching and 
Learning 

We continue to work with these and other data on teaching and learning in 
the Washington State schools. We are currently examining assignment and 
student work data in the context of other information about the 
characteristics of schools in this sample, the teachers who provided the 
assignments, and the students who supplied work. We are using complex 
modeling techniques to take these factors into account as we examine 
relationships between the rigor and authenticity of assignments, the quality 
of student work, and the utility of teacher feedback. 






Importantly, we are relating the Washington data to achievement test data 
for participating students so we can discuss the relationships between 
assignments, the work students do in class, teacher feedback, and students’ 
standardized test performance. In the Chicago work, researchers found 
moderate relationships between student work scores and standardized test 
results. They also found that students scored higher on standardized tests 
in schools where teachers gave more rigorous, authentic assignments. We 
discuss our upcoming analyses on these questions in Chapter 5. 







Chapter 4: Summary and Conclusions 



This chapter discusses the conclusions that we draw from our work thus 
far. It describes our work with English/language arts and mathematics 
teachers in foundation-supported schools in 2002-03 and with the 
assignments and work they provided. It describes the work of the teacher 
scorers and the data that resulted. 

We start our discussion with the teacher participants and the assignments 
and student work they submitted. 

• In 2002-03, teachers at foundation-supported schools in 
Washington State were willing to help us learn about teaching 
and learning in their schools. Forty-eight teachers provided 
samples of assignments at eight different times in the school year, 
along with samples of student work for three of those 
assignments. They also described their goals for instruction. 

• The project team developed systems for capturing and archiving 
the assignments and student work that teachers provided. These 
systems adequately supported database development and the 
scoring process. 

We draw several conclusions from our work with the rubrics and summer 
scoring. 

• Subject matter experts and experienced teachers in 
English/language arts and mathematics helped us adapt 
Chicago’s Authentic Intellectual Achievement framework to high 
school-level work and expand the measurement domain to 
include some of the unique goals of foundation-supported 
schools. We added rubrics to assess the choices students make 
about what they will study and how they will learn and the input
and opportunity students are given to revise and improve their 
work. These constructs are important to teaching and learning in 
innovative high schools. 

• In this inaugural year, we created training processes, training 
materials, and scoring procedures for assignments and student 
work in English/language arts and mathematics. 

• With the help of 24 experienced teachers, we ran successful 
scoring sessions in both disciplines. Scorers gave ratings to 
assignments and student work in English/language arts and 
mathematics with agreement rates that are typical of performance 
assessment scorings with rubrics similar to ours and like those 
obtained by our colleagues in Chicago. Exact-agreement rates for 
a couple of the mathematics rubrics were lower than we would 
like, and we will concentrate on these in preparation for next 
year. We will examine our scoring and training materials for
these rubrics to see if we can strengthen the guidance we provide
and increase agreement rates in 2004.

• Scorers in both disciplines said the rubrics helped them make 
important statements about the rigor and authenticity of 
assignments, the quality of student work, and the utility of 
feedback. Scorers said that the training and scoring sessions 
provided them with powerful professional development. 
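
The exact-agreement rates mentioned in the conclusions above can be illustrated with a small calculation. This is a minimal sketch under stated assumptions: the ratings are hypothetical, and the adjacent-agreement rate (scores within one point of each other) is included only as a commonly reported companion statistic; this section does not say whether the project computed it.

```python
def agreement_rates(first_scores, second_scores):
    """Return (exact, adjacent) agreement rates for two scorers.

    exact:    share of artifacts given identical rubric scores
    adjacent: share of artifacts whose two scores differ by at most one point
    """
    if len(first_scores) != len(second_scores) or not first_scores:
        raise ValueError("need two non-empty score lists of equal length")
    pairs = list(zip(first_scores, second_scores))
    exact = sum(a == b for a, b in pairs) / len(pairs)
    adjacent = sum(abs(a - b) <= 1 for a, b in pairs) / len(pairs)
    return exact, adjacent


# Hypothetical ratings on a 4-point rubric, invented for illustration.
scorer_a = [1, 2, 3, 4, 2, 3, 1, 4]
scorer_b = [1, 2, 4, 4, 2, 1, 1, 3]
exact, adjacent = agreement_rates(scorer_a, scorer_b)
print(f"exact agreement: {exact:.3f}, adjacent agreement: {adjacent:.3f}")
```

Computing the rates rubric by rubric, rather than pooled, is what would reveal which individual rubrics need stronger scoring guidance.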

Finally, we draw conclusions from the analyses we have completed to 
date. 

• Using the Many-Facet Rasch Model, we combined data across 
the different assignment rubrics and teacher scorers to create 
single estimates of the rigor and authenticity of classroom tasks 
in English/language arts and in mathematics. Similarly, we 
combined data across student work rubrics and scorers to 
estimate the overall quality of student work in each discipline. 
These modeling procedures worked well. 

• Our examination of individual low- and high-scoring assignments 
and low- and high-scoring student work lends some credence to 
the results. Our post-hoc characterizations of low-scoring 
products are consistent with score-point descriptions at the lower 
ends of the scoring rubrics. Similarly, the qualities of high- 
scoring assignments and work are consonant with the meanings 
of upper-end scores. 

• The assignment and student work scores that resulted from our 
procedures have reasonable, but not optimal, distributions. As we 
move forward in our work, we will examine the English/language 
arts rubrics to make sure they provide room to document the 
future progress of reforming schools. The clustering of scores for 
English/language arts assignments and student work in the upper 
ends of the score scale is not ideal in that the current scale may 
not provide room for teachers and students to demonstrate 
increased rigor and quality in the future. We expect the scores to 
increase as schools move forward and implement more 
innovative instructional approaches. The clustering of the scores 
in the lower range of the score scale for student work in 
mathematics is more consistent with presumed practice at schools 
planning for conversion. 

• The distributions of scores for typical assignments and 
challenging assignments lend validity to the combined scores for 
assignments and for student work. On average, as hypothesized, 
more typical assignments have lower scores than challenging 
assignments. Similarly, on average the scores for student work 
responding to typical assignments are lower than scores for 
student work produced in response to challenging assignments. 
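
For readers unfamiliar with the Many-Facet Rasch Model cited in these conclusions, a common rating-scale form of the model (following Linacre, 1989a) is sketched below. The mapping of facets to artifacts, rubrics, and scorers is an assumption based on the description above; this report does not state the exact parameterization that was estimated, so the symbols should be treated as illustrative.

```latex
\log\!\left( \frac{P_{nijk}}{P_{nij(k-1)}} \right) = B_n - D_i - C_j - F_k
```

Here $P_{nijk}$ is the probability that artifact $n$ receives score category $k$ on rubric $i$ from scorer $j$; $B_n$ is the rigor or quality of the artifact; $D_i$ is the difficulty of rubric $i$; $C_j$ is the severity of scorer $j$; and $F_k$ is the threshold between categories $k-1$ and $k$. Under this form, the estimate of $B_n$ plays the role of the single combined score for an assignment or piece of student work, adjusted for which rubrics and scorers were involved.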







This first year of research on teaching and learning in foundation- 
supported schools has yielded substantial methodological and 
measurement developments and provided promising results. Chapter 5 
discusses the work that remains with our 2002-03 data. It also describes
a new round of data collection in reforming schools in 2003-04.








Chapter 5: Next Steps 



As we enter the new year, we will continue and complete the first-year
data analyses, collect 2003-04 assignments and student work in a new 
sample of schools across the country, and make preparations for the 2004 
scoring. For the 2002-03 data from Washington State, we will: 

• Create reporting scales that have meaning for teacher participants 
and other school-based reformers. 

• Examine the relationships among assignments, student work, 
feedback, and achievement test scores in Washington State. 

For the 2003-04 data, we will: 

• Collect assignments and work in 12 new small schools and 4 large 
schools planning for conversion. 

• Refine scoring rubrics and procedures for the 2004 summer 
scoring session. 

• Examine possibilities for studying non-written and other
non-conventional work.

We discuss these efforts in turn. 

Creating Meaningful Reporting Scales 

We plan to examine the scoring data just presented, raw data from 
individual scoring rubrics, some of the intermediate results of the Many- 
Facet Rasch Model, and assignment and student work artifacts to create 
reporting scales that can be easily interpreted by participating teachers and 
other reformers. We plan to follow the approach taken by the Chicago 
Consortium researchers. Newmann, Lopez, and Bryk (1998) collapsed their
combined scales into 4-point reporting scales that described extensive
rigor and authenticity, moderate rigor and authenticity, minimal rigor and
authenticity, and no rigor and authenticity. They created similar scales
describing the quality of student work.

To create the 4-point scales for our data, we will need to iterate between 
scoring data and the artifacts to identify cut-points and score bands that 
support the inferences suggested by the four scale points. We will need to 
bring data analysts, subject matter experts, scoring data, and artifacts 
together to determine whether meaningful reporting scales can be created. 
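
Once cut-points have been chosen, collapsing a 0 to 10 combined score into a 4-point reporting level is mechanical. The sketch below assumes three hypothetical cut-points (2.5, 5.0, and 7.5) purely for illustration; the actual cut-points would have to come from the review of scoring data and artifacts described above, and the convention that a score exactly at a cut-point falls in the higher band is likewise an assumption.

```python
from bisect import bisect_right

# Level descriptions follow the four reporting categories named in the text.
LABELS = ("no", "minimal", "moderate", "extensive")


def reporting_level(score, cut_points=(2.5, 5.0, 7.5)):
    """Collapse a 0-10 combined score into a 4-point reporting level.

    cut_points are hypothetical placeholders, not values from the study.
    A score equal to a cut-point is assigned to the higher band.
    """
    if not 0 <= score <= 10:
        raise ValueError("score must lie between 0 and 10")
    if len(cut_points) != 3 or list(cut_points) != sorted(cut_points):
        raise ValueError("need three cut-points in ascending order")
    level = bisect_right(cut_points, score) + 1  # 1 through 4
    return level, f"{LABELS[level - 1]} rigor and authenticity"


for s in (1.2, 2.5, 6.9, 9.4):
    print(s, reporting_level(s))
```

Keeping the cut-points as a parameter makes it easy to compare candidate score bands against the artifacts before settling on a reporting scale.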

We believe the descriptive 4-point score scales will be more meaningful 
than the current 0 to 10 scales. We hope these scales will be useful to 
participating schools as they examine their efforts and make plans to 
improve teaching and learning. More generally, we hope these reporting
scales will make the results of our work more meaningful to school-based
reformers.



Examining the Relationships among Assignments, Student 
Work, and Achievement Test Results 

Using data from the schools in Washington State, we will conduct 
correlational analyses to address the following research questions: 

• What are the relationships among course and student 
characteristics (such as course level and students’ reading levels) 
and the rigor and authenticity of English/language arts and 
mathematics assignments? 

• To what extent are challenging assignments associated with more 
complex student work in English/language arts and mathematics? 

• What are the relationships among results on jurisdiction-sponsored 
achievement tests and English/language arts and mathematics 
assignment and student work scores? 

We will use hierarchical linear modeling techniques to examine the 
relationships between classroom characteristics (e.g., teacher background, 
student composition) and the rigor and authenticity of assignments. We 
also will examine the relationship between assignment characteristics and 
the quality of student work to determine if more complex learning 
opportunities prompt higher-quality efforts by students. Finally, we will 
relate assignment scores, student work scores, feedback data, and 
jurisdiction-sponsored standardized test results to each other to see how 
results on conventional achievement tests compare with what we learn
about teaching and learning from analyzing classroom assignments and 
work. To do this work, we will match the current assignments and student 
work with demographic data and achievement test scores for students in 
this dataset. The results of these analyses will be presented at the 
American Educational Research Association conference in April 2004. 
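
As a simple starting point for the correlational questions listed above, the strength of the linear relationship between two sets of scores can be summarized with a Pearson correlation. The sketch below uses hypothetical paired scores and is not the planned analysis: it ignores the nesting of students within teachers and schools, which is the reason hierarchical linear modeling is proposed.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    cross = sum(a * b for a, b in zip(dx, dy))
    ss_x = sum(a * a for a in dx)
    ss_y = sum(b * b for b in dy)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a list has no variation")
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical paired scores: one assignment score and one student work
# score per assignment, invented for illustration.
assignment_scores = [2, 4, 6, 8]
student_work_scores = [1, 3, 2, 5]
print(round(pearson_r(assignment_scores, student_work_scores), 3))
```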

Collecting Assignments and Work Nationwide 

In the fall of 2003, we moved beyond Washington State and began 
collecting assignments and student work from a national sample of 
foundation-supported schools.12 Some of the schools in the national data
collection have fairly innovative instructional programs, and we are 
beginning to think differently about the meaning of courses, assignments, 
and student work in reforming schools. 



12 See the Technical Appendix of this report for a description of the national sample. 







For example, some of the participating schools have no
English/language arts or mathematics courses; courses are
multidisciplinary or theme-based. At others, teachers do not give
assignments. At Big Picture schools, for example, students work on 
internship-based, semester-long projects that culminate in student products 
and public exhibitions. Students select and write proposals for their own 
projects with guidance from teachers and mentors. The curriculum is 
tailored to the needs and interests of individual students. At other 
participating schools, student work results from several students’ effort. 
For instance, in New Technology Foundation schools, instruction is 
organized around class projects that require students to work in groups and 
produce group, rather than individual, products. 

In cases like these, the project team has worked closely with grantees and 
school principals to develop sensible data collection plans. The plans have 
been developed to be sensitive to the schools’ instructional programs 
while assuring that the data collected in these schools are compatible with 
data collected in other participating schools. 

Refining Scoring Rubrics and Procedures 

In preparation for the 2004 summer scoring, we hope to make 
improvements to our scoring rubrics and procedures, using the data in this 
report, our experience at the 2003 summer scoring session, and feedback 
provided by the scorers. Chief among these are improvements to the 
materials and processes that drive scorer agreement rates. We hope to 
obtain higher perfect-agreement rates in the upcoming scoring,
particularly in mathematics. 

Also important, but more vexing, are changes to correct some of the skew 
in the combined score distributions. We would like to leave more room at 
the top of some of the English/language arts score scales to document 
positive changes in teaching and learning for schools that continue to 
reform. Uncovering the likely sources of distributional difficulty will take 
some detective work. We will need to examine the scoring rubrics, raw 
score distributions, and some of the Many-Facet Rasch Model estimates to 
determine if there are ways to expand the upper ends of the score scales. If 
we discover possibilities to do so, we then will need to come up with
feasible adjustments or corrections. (Some improvements may be too
cumbersome or costly.)







As we consider the refinements, we will need to decide whether suggested 
changes are likely to compromise comparisons between 2002-03 and later 
data. Some refinements to rubrics and procedures may be minor enough 
that they do not affect cross-year comparisons. Others may change the 
score data enough that new data would not be comparable to old data. 
There is a tension, therefore, between efforts to improve current methods 
and materials and assuring comparability over time. At this writing, we do 
not have sufficient information to make decisions intelligently on this 
issue. 

Exploring Options for Studying Non-written Work 

As mentioned above, several schools in the 2003-04 data collection have 
innovative instructional models. Some of these schools, as a matter of 
practice, ask students to produce work with media (e.g., video, audio, 
computer animations) that do not lend themselves well to the scoring 
processes with which we are familiar. Despite the fact that more and more 
schools are moving to innovative products, procedures to reliably score 
student work produced in alternate formats are not well understood. 

Thus, this year, we will collect a sample of non-written and non- 
conventional work from participating schools so we can examine the 
feasibility of characterizing this work. We plan to share a sample of non- 
conventional work with scorers at the end of the 2004 scoring session and 
ask them to help us brainstorm about possible evaluation of the work. One 
of the central issues is whether these types of work provide information 
about student performance not currently captured by written work. We 
also need to be concerned with whether the unique information is useful 
and whether it is amenable to systematic evaluation within our framework. 

We look forward to our continuing work with the 2002-03 data and with 
teacher participants in the 2003-04 schools. The questions raised in this 
chapter are interesting ones and we are eager to address them. We invite 
readers of this report to contact us for clarification of the information 
provided here or for additional detail. We welcome any and all 
suggestions for improving our methods and work. 







References 



American Institutes for Research and SRI International. (2003). High time 
for high school reform: Early findings from the evaluation of the 
National School District and Network Grants Program. Washington, 
DC: American Institutes for Research and SRI International. 

Bransford, J., Brown, A., & Cocking, R. (1999). How people learn: Brain, 
mind, experience, and school. Washington, DC: National Academy 
Press. 

Bryk, A. S., Nagaoka, J. K., & Newmann, F. M. (2000). Chicago
classroom demands for authentic intellectual work. Chicago, IL:
Consortium on Chicago School Research.

Linacre, J. M. (1989a). Many-Facet Rasch Measurement. Chicago, IL: 
MESA Press. 

Linacre, J. M. (1989b). A user’s guide to FACETS: Rasch measurement 
computer program. Chicago, IL: MESA Press. 

Newmann, F. M., Bryk, A. S., & Nagaoka, J. K. (2001). Authentic
intellectual work and standardized tests: Conflict or coexistence.
Chicago, IL: Consortium on Chicago School Research.

Newmann, F. M., Lopez, G., & Bryk, A. S. (1998). The quality of
intellectual work in Chicago public schools: A baseline report.
Chicago, IL: Consortium on Chicago School Research.







Technical Appendix 







Sampling and Data Collection for the 2002-03 Data 

This project was initially piloted in Washington State during the 2002-03 
school year. Eight assignments, four typical and four challenging, were 
collected from 48 teachers (24 English/language arts and 24 math) over 
the course of the school year. Student work associated with the teacher
assignments was also collected three times during the year from a random
sample of students selected in advance by the researchers and not revealed
to the teachers.

The teacher assignments and student work were scored in summer 2003 
by using rubrics developed by the American Institutes for Research and 
SRI International, based on the rubrics used in the study of the Chicago 
Public Schools by Fred Newmann, Tony Bryk, and others (Newmann, 
Lopez, & Bryk, 1998; Bryk, Nagaoka, & Newmann, 2000; Newmann, 
Bryk, & Nagaoka, 2001). 

School Selection 

Eight public schools and five alternate public schools that planned to 
undergo conversion into smaller schools in 2003-04 were identified by 
Fouts and Associates to participate in the study. The rationale for school 
selection included a combination of the following school factors: (1) large
size (allowing for a greater number of teachers eligible to participate),
(2) reasonable likelihood of success converting to small schools (in the
opinion of the team evaluating this reform effort), (3) history of 
administrative and teacher cooperation, (4) significant level of district 
support, (5) range in student ethnic diversity, and (6) range of geographic 
locations around the state. Two selected schools are in districts that have 
the Bill & Melinda Gates Foundation Model Districts grant. The 
remaining schools are involved in the Bill & Melinda Gates Foundation 
Achievers Program. 

The eight schools that participated in the study are: 

Clover Park High School A. C. Davis High School 

Henry Foss High School Foster High School 

North Central High School Mount Tahoma High School 

Port Angeles High School West Valley High School 

Teacher Selection 

Teachers were eligible if they: (1) taught English or mathematics to 
sophomore students; (2) had a class that consisted of mostly sophomore 
students, and at least 25% of all sophomores took that level of 
coursework; and (3) were likely to be teaching the same or similar types of 
courses during the 2004-05 school year. 







Overall, 48 teachers (24 English/language arts and 24 math) from the 8 
schools initially agreed to participate in the study. However, one 
mathematics teacher declined any further participation after submitting 
only one assignment, and one English/language arts teacher submitted all 
teacher assignments but did not submit any student work. Depending on 
school size, four to eight teachers represented each school. 

Student Selection 

All teacher participants sent their sophomore students a letter describing 
the study and giving the students and their parents the opportunity to 
choose not to participate. Teachers submitted a class list to their building 
coordinator after removing the names of the nonsophomore students and 
the names of the students who opted out. Researchers used this list to 
randomly select those students who would participate in the study. 
Depending on the number of sophomores in a class, 6 to 12 students were 
randomly selected to be participants, and 2 alternates were selected if 
available. The names of the participating students and alternates were not 
revealed to the teachers. 
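
The selection step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `select_students`, the `seed` argument, and the way the number of participants is passed in are hypothetical, because the report does not say how the count of 6 to 12 students was derived from class size or what software was used for the random draw.

```python
import random


def select_students(class_list, n_participants, n_alternates=2, seed=None):
    """Randomly pick participants and alternates from a cleaned class list.

    class_list: sophomore names remaining after opt-outs were removed.
    n_participants: how many students to sample (6 to 12 in the study;
        the rule linking this number to class size is not given in the
        report, so the caller supplies it here).
    n_alternates: alternates to draw "if available" (2 in the study).
    """
    rng = random.Random(seed)      # hypothetical: seeding only aids reproducibility
    shuffled = list(class_list)    # copy so the submitted list is left untouched
    rng.shuffle(shuffled)
    participants = shuffled[:n_participants]
    # Slicing past the end simply returns fewer alternates when the class is small.
    alternates = shuffled[n_participants:n_participants + n_alternates]
    return participants, alternates


if __name__ == "__main__":
    roster = [f"student_{i:02d}" for i in range(1, 21)]  # made-up roster
    chosen, backups = select_students(roster, n_participants=8, seed=2003)
    print(chosen)
    print(backups)
```

Because the draw is made by the researchers from the coordinator's list, the teacher never needs to see which names were chosen.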



Data Collection 



Assignments were collected eight times over the course of the year: four
times in the first semester and four times in the second semester. Student
work was also collected on three of the collection dates. Teachers
turned in student work for all students in the class to a coordinator at the 
school, and the coordinator made copies of the work of the randomly 
selected sample students and sent them to the data collector. All three 
student work data collections occurred during the second quarter of the 
school year. A single quarter was chosen to ensure that selected students 
did not transfer out of classes during quarter or semester breaks and to 
maximize the likelihood that they would not move during this data 
collection. The collection dates for the assignments were: 



Typical Assignments
December 2 - 13, 2002
January 6 - 17, 2003
March 3 - 14, 2003
May 19 - 30, 2003



Challenging Assignments 
September 9 - November 1, 2002 
November 11, 2002 - January 24, 2003 2 
February 3 - March 28, 2003 
April 7 - May 30, 2003 



Datasets 



The data collection produced two datasets (one for assignments and one
for student work) for each of the two subjects, English/language arts and
math. Teacher feedback information is included in the student work
database. The English/language arts assignment database has data on 177
assignments, and the student work database has data on 399 pieces of
student work. The mathematics assignment database includes 184
mathematics assignments, and the student work database contains 425
pieces of student work. Each piece of student work can be linked to a
teacher assignment.

1 Student work was collected in conjunction with these assignments.

Estimation Procedures 

One goal of this project is to build on the work of Newmann et al. and 
their study of assignments and student work in the Chicago Public Schools 
(Newmann, Lopez, & Bryk, 1998; Bryk, Nagaoka, & Newmann, 2000; 
Newmann, Bryk, & Nagaoka, 2001). To that end, the estimation
procedures described here are based on the procedures used by the 
Chicago researchers. There are two parts to these analyses. First, a Many- 
Facet Rasch Model (MFRM) analysis is used to combine the scores for the 
individual rubrics for each assignment (or piece of student work) into a 
single score of quality for that assignment. The second part of the analyses 
uses hierarchical linear modeling (HLM) conducted at the classroom level 
to examine the relationship between characteristics of the classroom 
(e.g., teacher background, student compositional variables) and the rigor 
of teacher assignments, as well as the relationships among the rigor of 
assignments, the quality of students’ work, and jurisdiction-sponsored 
standardized tests. For the purposes of this report, only the procedures 
associated with the MFRM analysis will be discussed. (The procedures for 
the HLM analyses will be discussed in a subsequent paper to be written 
for the 2004 AERA conference in April.) 

Analytic Approach Rationale: Many-Facet Rasch Measurement 

Each assignment and piece of student work received a score on each of
three to five rubrics (the number of rubrics depends on the subject area
and whether the item being scored was an assignment or a piece of
student work). The rubrics and their score scales are shown in
Table A.1.







Table A.1: Scoring Rubrics and Score Scales

English/Language Arts                          Score Scales

Assignments (see note 2)
  1. Construction of Knowledge                 1-3
  2. Elaborated Communication                  1-4
  3. Authentic Audience                        1-3
  4. Student Involvement                       1-4
Student Work
  1. Construction of Knowledge                 1-3
  2. Elaborated Communication                  1-4
  3. Language Conventions                      1-6
Teacher Feedback                               1-4

Mathematics                                    Score Scales

Assignments
  1. Important Mathematical Content            1-4
  2. Problem Solving and Reasoning             1-4
  3. Effective Communication                   1-3
  4. Relevant Contexts and Connections         1-4
  5. Student Involvement                       1-4
Student Work
  1. Conceptual Understanding                  1-4
  2. Procedural Knowledge                      1-4
  3. Problem Solving and Reasoning             1-4
  4. Effective Communication                   1-4
Teacher Feedback                               1-4

In addition, all assignments were scored again by a second scorer for each 
of the rubrics, as a check on the reliability of the scoring. Approximately 
50% of the student work in English/language arts and 40% of mathematics 
work was also double-scored. As a result, there are quite a few pieces of 
data for each assignment and each piece of work from the multiple rubrics 
and scorers, and it is more useful for the analysis if the data can be 
combined into a single score for each assignment and a single score for 
each piece of student work. 

2 The ELA assignment and work rubrics also allowed scorers to indicate if there was 
insufficient information to assign a score. 









What is the best way to go about combining the data? A simple average or 
the sum of the raw ratings is not adequate for these purposes, because of 
two factors that are sources of variability in the raw ratings. The first 
factor relates to differences in the severity of the scorers: one scorer may 
have higher standards than another. The second is associated with 
differences in the stringency of the rubrics. For example, it may be harder 
for an English/language arts assignment to achieve a top score in 
Construction of Knowledge than in Elaborated Communication. 

Had all the assignments been rated by all the raters on all the rubrics, then
the simple average of the ratings would have balanced out any differences 
in rater severity. However, such a massive rescoring activity was not 
feasible. Thus, we need to adjust statistically for the differences in the 
severity of the scorers and the stringency of the rubrics. We use the Many- 
Facet Rasch Measurement (Linacre, 1989a) technique to combine the 
individual raw scores from both scorers on each assignment or piece of 
work and, ultimately, to develop numeric scales to quantify the intellectual 
challenge of assignments and the overall quality of the student work. 

Scales are developed separately for the assignments and student work and 
for English/language arts (ELA) and mathematics. 

The presence of differences in rater severity and in rubric stringency is one 
of the main reasons that this Many-Facet Rasch Model calibration step is 
important. It adjusts for rater severity, in terms of the estimated measure 
for each assignment or piece of student work. Likewise, it adjusts for the 
difficulty of the rubrics. 



Theoretical Model 



The Many-Facet Rasch Model used for assignments is:

\[
\log\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right) = B_n - C_i - D_j - F_{ik}
\]

where

P_{nijk} is the probability of assignment n being given a rating of k on rubric i
by scorer j

P_{nij(k-1)} is the probability of assignment n being given a score of k-1 on
rubric i by scorer j

B_n is the parameter for assignment n (quality of the assignment)

C_i is the parameter for rubric i (stringency of the rubric)

D_j is the parameter for scorer j (severity of the scorer)

F_{ik} is the parameter for receiving a rating of k relative to k-1 on rubric i
(step difficulty).







The model for student work is:

\[
\log\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right) = B_n - C_i - D_j - F_{ik}
\]

where

P_{nijk} is the probability of student work n being given a score of k on rubric
i by scorer j

P_{nij(k-1)} is the probability of student work n being given a score of k-1 on
rubric i by scorer j

B_n is the parameter for student work n (quality of the work)

C_i is the parameter for rubric i (stringency of the rubric)

D_j is the parameter for scorer j (severity of the scorer)

F_{ik} is the parameter for receiving a score of k relative to k-1 on rubric i
(step difficulty).
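
To make the model concrete, the following Python sketch turns one set of parameters into the probability of each score category, using the relation log(P_k / P_(k-1)) = B - C - D - F_k for adjacent categories. It is an illustration only: the function name and the parameter values are hypothetical, and it does not reproduce how FACETS estimates the parameters.

```python
import math


def category_probabilities(b, c, d, steps):
    """Score-category probabilities implied by the adjacent-category model.

    b: quality of the assignment or piece of student work (B_n)
    c: stringency of the rubric (C_i)
    d: severity of the scorer (D_j)
    steps: step difficulties F_ik for moving from category k-1 to k,
           one value per step (a 4-point rubric has 3 steps)
    Returns a list of probabilities, lowest score category first.
    """
    # Cumulative log-odds relative to the lowest category (fixed at 0).
    log_weights = [0.0]
    for f in steps:
        log_weights.append(log_weights[-1] + (b - c - d - f))
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_weights)
    weights = [math.exp(w - top) for w in log_weights]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # Hypothetical values on the logit scale, chosen only for illustration.
    steps = [-1.0, 0.0, 1.0]
    lenient = category_probabilities(b=0.5, c=0.0, d=-0.8, steps=steps)
    severe = category_probabilities(b=0.5, c=0.0, d=0.8, steps=steps)
    print([round(p, 3) for p in lenient])
    print([round(p, 3) for p in severe])
```

Holding B and C fixed while raising D shifts probability toward the lower score categories, which is the sense in which a more severe scorer gives lower scores to work of the same quality.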



The product of the analysis is the measure of each element of three facets:
the assignment rigor and authenticity (or student work quality), B_n; the
rubric stringency, C_i; and the scorer severity, D_j (as well as the measure of
step difficulty, F_{ik}, which is an output of the model in which we are less
interested). The Many-Facet Rasch Model analysis corrects the estimates
of assignment rigor and authenticity and the quality of student work for 
scorer severity and rubric difficulty. The Rasch-adjusted measures for 
assignments and student work, and their associated standard error 
estimates, will then be used as data for the HLM analyses. Parameters for 
ELA assignments, ELA student work, mathematics assignments, and 
mathematics student work will each have their own scales, which will not 
be linked to the scales of the others. 



For example, at the end of the Many-Facet Rasch Model analysis, each 
scorer will have a parameter estimate to quantitatively represent his or her 
severity. Likewise, there will be a stringency parameter associated with 
each rubric and a parameter for each teacher assignment. All of these 
parameters are placed on a common scale so that they can be compared 
with each other. 



There is also one teacher feedback rubric for student work. Because 
teacher feedback has only one rubric, it is not included in the Rasch 
measurement analysis but is examined and reported separately. 



Rescaling 

The default setting of the FACETS program (Linacre, 1989b), which
performs the Rasch analysis, chooses the local origins of the scales such that
the mean calibrations of the scorers, the rubrics, and the scoring scale
structure are all zero. As a result, the local origins of the quality of
assignments and student work are defined by the model; that is, the mean
of the rater measures is zero, the mean of the rubric measures is zero, and
the mean of the step difficulties of each rubric is also zero. However, this
means that the means of the student work and assignment measures are not
necessarily zero.

In other words, FACETS sets the origins as follows, in relation to
the Rasch Model described above:

\[
\sum_{i=1}^{m} C_i = 0, \quad \sum_{j=1}^{12} D_j = 0, \quad \text{and} \quad \sum_{k=1}^{t} F_{ik} = 0
\]

where m is the number of rubrics (which is different for each of the four
groups: ELA assignments, ELA student work, mathematics assignments,
and mathematics student work), 12 is the number of scorers, and t is the
number of score categories for rubric i. Because the means of C, D, and F
are already determined, the mean of B (the assignment or student work
parameter) is not constrained to equal zero.

Because the logit measure theoretically ranges from negative infinity to 
positive infinity, it is not a scale that is easy to interpret. For reporting 
purposes, we rescale the logit measure to a 0 to 10 scale. The 
transformation formula is: 

Assignment (or student work) measure = 

10 x (logit measure - min)/(max - min) 

where logit measure is the original measure for either assignment or 
student work, min is the minimum value of the logit, and max is the 
maximum value of the logit. By the same operation, the estimated 
standard error is also transformed to the same scale by the formula: 

Standard error = original standard error x 10/(max - min)
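
The rescaling can be illustrated with a short Python sketch. The logit values and standard errors below are made up for the example, and the function name `rescale` is hypothetical. The key point is that a linear map that sends the minimum logit to 0 and the maximum to 10 multiplies every distance, and therefore every standard error, by 10/(max - min).

```python
def rescale(logits, standard_errors):
    """Map logit measures onto a 0-10 scale and rescale their standard errors.

    logits: Rasch measures for assignments (or student work), in logits
    standard_errors: the matching standard errors, also in logits
    """
    lo, hi = min(logits), max(logits)
    factor = 10.0 / (hi - lo)               # stretch applied to the whole scale
    measures = [(x - lo) * factor for x in logits]
    # Shifting by `lo` does not affect spread, so only the factor applies here.
    scaled_se = [se * factor for se in standard_errors]
    return measures, scaled_se


if __name__ == "__main__":
    # Hypothetical logit measures and standard errors.
    measures, ses = rescale([-2.0, 0.0, 3.0], [0.5, 0.4, 0.6])
    print(measures)
    print(ses)
```

Note that the rescaled values depend on the minimum and maximum in a given dataset, so scores from separately scaled datasets are not directly comparable.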

Overall Reliability of the Measures 

One of the important questions to ask regarding this Many-Facet Rasch 
Model analysis is to what extent the estimated Rasch scores (based on raw 
scores assigned by using the scoring rubrics) provide reliable measures of 
the rigor and authenticity of teacher assignments and the quality of student 
work. This question can be answered by examining the reliability 
estimates produced by the FACETS program for the assignments and 
student work. The reliability calculated by FACETS is the Rasch
equivalent of the KR-20 or Cronbach's alpha statistic: the ratio of the true
variance to the observed variance. From the FACETS output, the
reliability statistics are 0.85 for English/language arts assignments, 0.78
for English/language arts student work, 0.70 for mathematics assignments,
and 0.62 for mathematics student work. Given the number of scored
assignments and pieces of work and the number of score levels, these
reliabilities are typical compared with other “tests” of similar length, while
a reliability of 0.85 is considered very good by common standards.

English/language arts shows higher reliability than math, and assignments
show higher reliability than student work. The reliability of the models
ranges from mathematics student work, where the model accounts for 62%
of the variance in the data, to ELA assignments, where the model accounts
for 85% of the variance. The square root of the reliability is an estimate
of the correlation between the true score and the observed score. It ranges
from 0.79 for mathematics student work (the square root of 0.62) to 0.92
for ELA assignments (the square root of 0.85). The reliability estimates
indicate that the Rasch measures correlate highly with the true scores of
the student work and assignments.
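
The square-root relationship can be checked directly from the four reliabilities reported above. The short Python sketch below uses only those reported values; the dictionary and variable names are illustrative.

```python
import math

# Reliability statistics reported from the FACETS output.
reliabilities = {
    "ELA assignments": 0.85,
    "ELA student work": 0.78,
    "Mathematics assignments": 0.70,
    "Mathematics student work": 0.62,
}

# Estimated correlation between true and observed scores: sqrt(reliability).
true_observed_correlation = {
    name: round(math.sqrt(value), 2) for name, value in reliabilities.items()
}

if __name__ == "__main__":
    for name, r in true_observed_correlation.items():
        print(f"{name}: {r:.2f}")
```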

Facet 1: Rasch Scores as Valid Measures of Assignments and
Student Work 

On the basis of the high reliability coefficients discussed above, it appears 
that forming a single score for the assignments or student work based on 
the rubrics is reasonable. In addition to these reliability requirements, it is 
also important to examine how well the individual rubrics relate to one 
another: are measures that are expected to tap the same higher-order 
construct correlated, or do they seem to behave like independent 
measures? To assist in the interpretation of the Rasch measures, it is also 
important to understand how the individual rubrics correlate with the 
aggregate measure produced by the Rasch analysis. 

To get a sense of the interdependence among the rubrics, we looked at 
correlations among the raw scores for each of the rubrics and the Rasch 
measures. Table A.2 shows the resulting correlation matrix.

The rubrics all have positive and highly statistically significant 
correlations with each other and with the Rasch measures, which increases 
our confidence that a high score on the Rasch measure tends to represent 
high raw scores on the individual rubrics that make up the scale. The 
Rasch measure correlates more highly with each of the raw ratings than 
the raw ratings correlate with each other, suggesting that the Rasch 
measure does a good job of summarizing the separate raw measures, and 
enabling us to use a single Rasch measure in place of the separate 
measures. 







Table A.2: Correlations among Rasch Measures and Raw Scores on
Individual Rubrics (p-values shown below correlations)

ELA Assignments (N=177)                        Measure   CK        EC        AA        SI
Construction of Knowledge                      0.67      1
                                               <0.0001
Elaborated Communication                       0.80      0.58      1
                                               <0.0001   <0.0001
Authentic Audiences                            0.58      0.26      0.32      1
                                               <0.0001   0.0005    <0.0001
Student Involvement in Crafting Assignments    0.66      0.29      0.46      0.37      1
                                               <0.0001   <0.0001   <0.0001   <0.0001

ELA Student Work (N=399)                       Measure   CK        EC        LC
Construction of Knowledge                      0.78      1
                                               <0.0001
Elaborated Communication                       0.82      0.58      1
                                               <0.0001   <0.0001
Language Conventions and Resources             0.88      0.58      0.70      1
                                               <0.0001   <0.0001   <0.0001

Mathematics Assignments (N=184)                Measure   IMC       PS/R      EC        RC
Important Mathematics Content                  0.65      1
                                               <0.0001
Problem Solving and Reasoning                  0.74      0.39      1
                                               <0.0001   <0.0001
Effective Communication                        0.55      0.27      0.37      1
                                               <0.0001   0.0003    <0.0001
Relevant Context and Real-World Connections    0.52      0.20      0.36      0.22      1
                                               <0.0001   0.0057    <0.0001   0.0029
Student Involvement in Crafting Assignments    0.42      0.23      0.27      0.30      0.18
                                               <0.0001   0.0016    0.0002    <0.0001   0.0133

Mathematics Student Work (N=425)               Measure   CU        PK        PS/R      EC
Conceptual Understanding                       0.63      1
                                               <0.0001
Procedural Knowledge                           0.63      0.14      1
                                               <0.0001   0.0036
Problem Solving and Reasoning                  0.61      0.52      0.12      1
                                               <0.0001   <0.0001   0.0172
Effective Communication                        0.67      0.40      0.28      0.46      1
                                               <0.0001   <0.0001   <0.0001   <0.0001










Facet 2: Rubric Stringency

The Rasch measure of the stringency of the rubrics indicates the difficulty
level for each rubric; the higher the measure, the more difficult the
rubric (that is, the harder it is to get a high score on that rubric). In this
section, we discuss the rubric stringency measure.

English/Language Arts Assignments

The most stringent rubric for ELA assignments is Authentic Audiences
and the least stringent rubric is Construction of Knowledge. The measures
of the rubrics for the ELA assignments are:

  Construction of Knowledge                        -1.74
  Elaborated Communication                         -1.07
  Authentic Audiences                               1.84
  Student Involvement in Crafting Assignments       0.98

English/Language Arts Student Work

The measures of the rubrics for ELA student work are:

  Construction of Knowledge                        -0.44
  Elaborated Communication                         -0.38
  Language Conventions and Resources                0.83

Mathematics Assignments

The most stringent rubric for mathematics assignments is Student
Involvement in Crafting the Assignments, with an MFRM parameter of
0.78. The least stringent is Problem Solving and Reasoning, with the
lowest MFRM parameter of -0.78. The measures of rubrics for the
mathematics assignments are:

  Important Mathematical Content                   -0.30
  Problem Solving and Reasoning                    -0.78
  Effective Communication                          -0.03
  Relevant Contexts and Real-World Connections      0.34
  Student Involvement in Crafting Assignments       0.78

Mathematics Student Work

The measures of the rubrics for mathematics student work are:

  Conceptual Understanding                          0.22
  Procedural Knowledge                             -1.87
  Problem Solving and Reasoning                     0.37
  Effective Communication                           1.28
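
As a consistency check on these values, recall that the FACETS default centers the rubric facet so that the rubric measures in each group sum to zero. The Python sketch below sums the reported measures for each group. The tolerance of 0.02 is an assumption chosen to allow for rounding of the published values to two decimal places.

```python
# Rubric stringency measures as reported in this section (logits).
rubric_measures = {
    "ELA assignments": [-1.74, -1.07, 1.84, 0.98],
    "ELA student work": [-0.44, -0.38, 0.83],
    "Mathematics assignments": [-0.30, -0.78, -0.03, 0.34, 0.78],
    "Mathematics student work": [0.22, -1.87, 0.37, 1.28],
}

# Assumed tolerance: each published value is rounded to two decimals.
ROUNDING_TOLERANCE = 0.02

group_sums = {name: sum(values) for name, values in rubric_measures.items()}

if __name__ == "__main__":
    for name, total in group_sums.items():
        status = "ok" if abs(total) < ROUNDING_TOLERANCE else "check"
        print(f"{name}: sum = {total:+.2f} ({status})")
```

Each group sums to zero within rounding, which is what the centering constraint on the rubric facet implies.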



Facet 3: Rater Severity 

Table A.3 lists the rubrics and illustrates the differences in scores given by
different scorers for each rubric for those assignments and student work
that were scored by two raters. The first column shows the percentage of
observations that received the same rating by the two scorers (i.e., perfect
agreement), and the second column shows the percentage of observations
that were scored within one point. The score scales for the different
rubrics range from 3 to 6 points. The English/language arts scorers had
perfect agreement of 60% or more on all four of the assignment rubrics,
all three of the student work rubrics, and the teacher feedback score
(column 1). They had at least 90% agreement within one point for all but
one rubric (column 2). There was much more variation among the
mathematics scorers, particularly for assignments, which ranged from 44%
perfect agreement on Important Mathematical Content to 91% perfect
agreement on the Student Involvement rubric. However, the percentages
of agreement within one point for mathematics scores are comparable to
those of English/language arts. The teacher feedback ratings between
scorers are quite consistent for mathematics, with 100% agreement within
one point.

Differences among raters could be due to a variety of factors, such as the 
clarity and specificity of the scoring rubrics (including the clarity and 
specificity of the benchmarks used in the rubrics), the effectiveness of the 
scorer training, and the nature of the assignments themselves. 
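
The two agreement statistics used in this section are simple to compute from double-scored items. The Python sketch below shows one way to do so; the function name and the score pairs are hypothetical and are not data from the study.

```python
def agreement_rates(score_pairs):
    """Return (perfect agreement, agreement within one point) as proportions.

    score_pairs: list of (first scorer's score, second scorer's score)
                 for items rated by two scorers on the same rubric.
    """
    if not score_pairs:
        raise ValueError("at least one double-scored item is required")
    n = len(score_pairs)
    perfect = sum(1 for a, b in score_pairs if a == b)
    # "Within one point" includes perfect agreement.
    within_one = sum(1 for a, b in score_pairs if abs(a - b) <= 1)
    return perfect / n, within_one / n


if __name__ == "__main__":
    # Hypothetical scores on a 4-point rubric.
    pairs = [(3, 3), (2, 3), (4, 2), (1, 1), (2, 2)]
    perfect, within_one = agreement_rates(pairs)
    print(f"perfect agreement: {perfect:.0%}")
    print(f"agreement within 1 point: {within_one:.0%}")
```

Because within-one-point agreement counts exact matches as well, it can never be lower than perfect agreement, and it is easier to achieve on rubrics with fewer score points.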







Table A.3: Agreement Rates on Assignments and Student Work, by Rubric
(for assignments and work rated by two scorers)

English/Language Arts                              Perfect      Agreement
                                                   Agreement    within 1 point
Assignments
  Construction of Knowledge                        64%          96%
  Elaborated Communication                         60%          93%
  Authentic Audience                               69%          94%
  Student Involvement in Crafting Assignments      67%          79%
Student Work
  Construction of Knowledge                        63%          94%
  Elaborated Communication                         60%          90%
  Language Conventions and Resources               69%          91%
Teacher Feedback                                   67%          95%

Mathematics                                        Perfect      Agreement
                                                   Agreement    within 1 point
Assignments
  Important Mathematical Content                   44%          92%
  Problem Solving and Reasoning                    52%          88%
  Effective Communication                          72%          99%
  Relevant Contexts and Real-world Connections     68%          93%
  Student Involvement in Crafting Assignments      91%          99%
Student Work
  1. Conceptual Understanding                      68%          89%
  2. Procedural Knowledge                          45%          79%
  3. Problem Solving and Reasoning                 83%          93%
  4. Effective Communication                       73%          99%
Teacher Feedback                                   92%          100%



From the MFRM, the scorer measures indicate the severity of each scorer.
The more severe scorer gives consistently lower scores to assignments of
the same quality, compared with the less severe scorer. The ranges of the
scorer measures (i.e., the differences between the most and least severe
scorers) are 0.83, 0.43, 1.65, and 0.98 standard deviations for
English/language arts assignments, English/language arts student work,
mathematics assignments, and mathematics student work, respectively.
The ranges show that the raters for the mathematics teacher assignments
have the largest disparity in severity, a difference of 1.65 standard
deviations. The scorers for the English/language arts student work have the
smallest range in severity (0.43 standard deviation). The wide
disparity in severity among the scorers of mathematics teacher
assignments may cause concerns about fairness in scoring. This is
especially worrisome when not all scorers rate all assignments. In the
future, scorer training will attempt to close the severity gap. In
addition, the FACETS output indicates that raters differ in severity at
different rubric score levels. Thus, rater training will also focus on
reducing differences between scorers at each score level of the various
rubrics.

Future Data Collections 

The schedule for future data collections is shown in Table A.4. In 2003-04
we are collecting assignments and work from 12 small start-up high
schools and 4 additional pre-conversion schools. In 2004-05 we will again
collect data from these same 12 small start-up high schools so we can
examine changes over time in these schools. In addition, we will collect
data from 8 large traditional high schools so these data can be compared
to data for the 12 small start-ups. The 8 Washington pre-conversions from
the pilot year will have converted into small learning communities and
will be in their second year of conversion. We expect to collect data from
2 small learning communities for each of the original 8 pre-conversions
and plan to use the same teachers, if possible.

In the final year of data collection, we will follow up on the 4 pre- 
conversions from Year 1; by then they will have been in conversion for 
over a year. As with the Washington schools, we will collect data from 2 
of the smaller learning communities from each of the 4 originally large 
schools, and attempt to get assignments from the same teachers. 







Table A.4: Data Collection Schedule

School Type       Pilot Year    Year 1       Year 2               Year 3
                  (2002-03)     (2003-04)    (2004-05)            (2005-06)
Startups                        12           12 (same as yr 1)
Comparison                                   8
Pre-conversion    8 WA          4 non-WA
Conversion                                   16 WA (note 3)       8 non-WA (note 4)
TOTAL             8             16           36                   8

3 Two small learning communities from each of the eight Washington schools. 

4 Two small learning communities from each of the four converting schools outside 
Washington State. 







